Problem with HTTP/2 CONNECT requests (Document loading failed)

I have been using Nextcloud with Collabora for years without trouble. Suddenly some users are having problems and see “Document loading failed”. I found a way to reproduce this: Firefox loads the document fine, Chrome fails.

This is new.

Drilling down, the most likely reason is a difference in the incoming request. To clarify, I use lighttpd on the gateway, which forwards requests to the backend server, and it has held this config for years:

        $HTTP["host"] =~ "^cloud\." {
            proxy.server  = ( "" => ( ( "host" => ip_cloud ) ) )
        }
        else $HTTP["host"] =~ "^office\." {
            # Collabora essentially has two types of communication, both over the port it listens on
            # (9980): WebSockets, which need the Upgrade and Connection headers set, and static
            # files, which don't. They provide detailed examples for Apache and Nginx covering every
            # form of URL they want, but we can distil two categories easily enough.
            $REQUEST_HEADER["Upgrade"] == "websocket" {
                setenv.add-request-header = ("Connection" => "Upgrade")
                proxy.header = ( "upgrade" => "enable" )
                proxy.server = ( "" => ( ( "host" => ip_cloud, "port" => port_cool, "upgrade" => "enable" ) ) )
            } else {
                proxy.server = ( "" => ( ( "host" => ip_cloud ) ) )
            }
        } else {
            proxy.server  = ( "" => ( ( "host" => ip_live ) ) )
        }

Up front: yes, I understand the world tends to use Apache, then nginx, and you may not be familiar with lighttpd configs. I don’t need lighttpd help directly; rather, I want to explain how things worked comfortably for years, why I believe they’ve suddenly broken, and how I diagnosed that.

To summarise, the config above sits on a gateway. It simply takes any cloud. URLs and forwards them to the backend server, and handles office. URLs with more nuance, as follows. If “Upgrade: websocket” appears in the HTTP headers, the request is forwarded to the coolwsd server. FYI:

var.port_cool = "9980"

If it doesn’t have “Upgrade: websocket” in hte HTTP headers it’s s tatic file request and it forwards on to the webserver on ip_cloud which delivers the static files.

Finally, if it’s neither cloud. nor office., it is just forwarded to a generic web server on the backend (at ip_live).
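The three-way routing described above can be modelled in a few lines of Python. This is only a sketch of the decision logic, not how lighttpd evaluates its config: `pick_backend`, the string stand-ins for `ip_cloud` and `ip_live`, and the default port of 80 are assumptions made for illustration (the config itself does not state a port for the plain web server).

```python
from typing import Mapping, Tuple

# Hypothetical stand-ins for the lighttpd variables ip_cloud, ip_live and port_cool.
IP_CLOUD = "ip_cloud"
IP_LIVE = "ip_live"
PORT_COOL = 9980
DEFAULT_PORT = 80  # assumption: the config does not give the web server port


def pick_backend(host: str, headers: Mapping[str, str]) -> Tuple[str, int]:
    """Return (backend, port) following the same branches as the gateway config."""
    # HTTP header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if host.startswith("cloud."):
        return (IP_CLOUD, DEFAULT_PORT)
    if host.startswith("office."):
        # Only requests carrying "Upgrade: websocket" go to coolwsd.
        if lowered.get("upgrade") == "websocket":
            return (IP_CLOUD, PORT_COOL)
        return (IP_CLOUD, DEFAULT_PORT)
    return (IP_LIVE, DEFAULT_PORT)


print(pick_backend("office.example.org", {"Upgrade": "websocket"}))
print(pick_backend("office.example.org", {}))
print(pick_backend("www.example.org", {}))
```

Note that the only thing separating coolwsd traffic from static file traffic in this model is the presence of the Upgrade header, which is exactly the assumption the rest of this post turns on.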

How did I diagnose the issue?

By examining the lighttpd and coolwsd trace logs. They reveal:

  1. Firefox sends an HTTP/1.1 GET request with “Upgrade: websocket” and all is good.
  2. Chrome sends an HTTP/2 CONNECT request, which isn’t handled.

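I infer this is a new thing in Chrome (and perhaps other clients). The consequence is that the CONNECT request does not match the “Upgrade: websocket” condition, so it falls through to the final else branch and is sent unaltered to the web server at ip_cloud rather than to coolwsd’s WebSocket endpoint on port_cool.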
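For context, a CONNECT arriving from a browser over HTTP/2 is most plausibly the WebSocket bootstrap from RFC 8441: the client sends the method CONNECT with a `:protocol` pseudo-header set to `websocket`, and no Upgrade or Connection header. Assuming that is what the logs show (the post does not quote the raw request), the Python sketch below contrasts the two handshakes. The function name and the sample header sets are hypothetical.

```python
from typing import Mapping


def is_websocket_handshake(method: str, headers: Mapping[str, str]) -> bool:
    """True for the HTTP/1.1 or the HTTP/2 (RFC 8441) form of a WebSocket handshake."""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    if method == "GET":
        # HTTP/1.1 form: GET plus "Upgrade: websocket".
        return lowered.get("upgrade") == "websocket"
    if method == "CONNECT":
        # HTTP/2 form: extended CONNECT plus ":protocol: websocket".
        return lowered.get(":protocol") == "websocket"
    return False


# Illustrative requests only, reduced to the fields that matter here.
firefox = ("GET", {"Upgrade": "websocket", "Connection": "Upgrade"})
chrome = ("CONNECT", {":protocol": "websocket", ":scheme": "https"})

# A test on the Upgrade header alone recognises only the first request.
print([h.get("Upgrade") == "websocket" for _, h in (firefox, chrome)])  # [True, False]
# Checking both forms recognises both.
print([is_websocket_handshake(m, h) for m, h in (firefox, chrome)])  # [True, True]
```

Under that assumption, any gateway rule keyed only on the Upgrade header will treat the HTTP/2 handshake as an ordinary request.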

My questions are simple:

  1. Does this sound familiar? Has anyone else encountered this?
  2. When I try to forward the CONNECT requests to coolwsd, it seems not to work, alas. Really, I just added a new condition:

        $HTTP["host"] =~ "^cloud\." {
            proxy.server  = ( "" => ( ( "host" => ip_cloud ) ) )
        }
        else $HTTP["host"] =~ "^office\." {
            # Collabora essentially has two types of communication, both over the port it listens on
            # (9980): WebSockets, which need the Upgrade and Connection headers set, and static
            # files, which don't. They provide detailed examples for Apache and Nginx covering every
            # form of URL they want, but we can distil two categories easily enough.
            $REQUEST_HEADER["Upgrade"] == "websocket" {
                setenv.add-request-header = ("Connection" => "Upgrade")
                proxy.header = ( "upgrade" => "enable" )
                proxy.server = ( "" => ( ( "host" => ip_cloud, "port" => port_cool, "upgrade" => "enable" ) ) )
            } else $HTTP["request-method"] == "CONNECT" {
                proxy.server = ( "" => ( ( "host" => ip_cloud, "port" => port_cool, "upgrade" => "enable" ) ) )
            } else {
                proxy.server = ( "" => ( ( "host" => ip_cloud ) ) )
            }
        } else {
            proxy.server  = ( "" => ( ( "host" => ip_live ) ) )
        }

That is, if the request method is CONNECT, forward to port_cool. Alas, it seems lighttpd is getting a 405 (Method Not Allowed) back from coolwsd and forwarding that to the client. At least, that is my provisional diagnosis.

Which is why the second question is really:

Is there some coolwsd config I need to attend to for it to accept HTTP/2 CONNECT requests on its port 9980? Or worse, have I misunderstood something profound?

Thank you so much @bernd-wechner for your thorough testing and detailed report — much appreciated! :raising_hands:

I’ve created an issue in the GitHub repository to track this:
https://github.com/CollaboraOnline/online/issues/12155

If you could leave a comment there, you’ll be subscribed to updates, so you’ll get notified as soon as there’s progress or a fix is released.

Thanks again for your valuable contribution!

Thanks
Darshan

Thank you for your prompt reply and for transferring this to GitHub!

A mild shame, I admit, that it’s not just a simple configuration issue I could fix.

Part of me is thinking I might have to find a way to change the request before forwarding it to the backend server. I haven’t researched how feasible that is yet, though. But of course, if there is a fix at the Collabora end, either a configuration change or a new version, I’ll be happy too. It’s a bit of a downer to find some of my users unable to open docs, and it took quite a while to nail down a likely reason. I was traipsing through verbose logs with Gemini’s help (it can find and report oddities quite well) and testing on my phone and then in other browsers. I first reproduced it with Android Firefox, but alas it has no debug console that I can find, so I was pleased when it reproduced in Chrome on the desktop, where I could watch the debug console and the network tab in the dev tools.
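As an illustration of what changing the request before forwarding could mean, here is a Python sketch that turns an HTTP/2 extended CONNECT (RFC 8441) into the HTTP/1.1 GET-plus-Upgrade handshake a backend would expect. It is a sketch under assumptions, not a description of what lighttpd or coolwsd do internally: the function name, the sample path and the host name are hypothetical. One detail is certain from the RFCs: the HTTP/2 form carries no Sec-WebSocket-Key, so a translating gateway has to generate one.

```python
import base64
import os
from typing import Dict, Mapping, Tuple


def to_http11_upgrade(
    pseudo: Mapping[str, str], headers: Mapping[str, str]
) -> Tuple[str, Dict[str, str]]:
    """Build an HTTP/1.1 WebSocket handshake from an HTTP/2 extended CONNECT."""
    if pseudo.get(":method") != "CONNECT" or pseudo.get(":protocol") != "websocket":
        raise ValueError("not an RFC 8441 WebSocket CONNECT")
    request_line = "GET {} HTTP/1.1".format(pseudo.get(":path", "/"))
    out = dict(headers)  # carry over ordinary headers such as sec-websocket-version
    out["Host"] = pseudo[":authority"]
    out["Upgrade"] = "websocket"
    out["Connection"] = "Upgrade"
    # RFC 6455 requires a base64-encoded 16-byte nonce in the HTTP/1.1 handshake.
    out["Sec-WebSocket-Key"] = base64.b64encode(os.urandom(16)).decode("ascii")
    return request_line, out


# Hypothetical request; the path and host are made up for the example.
line, hdrs = to_http11_upgrade(
    {
        ":method": "CONNECT",
        ":protocol": "websocket",
        ":scheme": "https",
        ":path": "/example/ws",
        ":authority": "office.example.org",
    },
    {"sec-websocket-version": "13"},
)
print(line)
print(hdrs["Upgrade"], hdrs["Connection"])
```

A real gateway would also have to translate the response (101 Switching Protocols on the HTTP/1.1 side, a 200 on the HTTP/2 side) and relay the frames, which this sketch leaves out.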

I can confirm also, that if I start Chrome with:

/usr/bin/google-chrome-stable  --disable-http2

then Chrome does load the document! No surprise, but a test I tried just out of interest. It suggests that Chrome defaulting to HTTP/2 here is likely a recent change that Firefox hasn’t made yet, and that HTTP/1.1 connections are handled fine.

A workaround I have found that works on lighttpd, in any case, is to downgrade the CONNECT request:

        $HTTP["host"] =~ "^cloud\." {
            proxy.server  = ( "" => ( ( "host" => ip_cloud ) ) )
        }
        else $HTTP["host"] =~ "^office\." {
            # Collabora essentially has two types of communication, both over the port it listens on
            # (9980): WebSockets, which need the Upgrade and Connection headers set, and static
            # files, which don't. They provide detailed examples for Apache and Nginx covering every
            # form of URL they want, but we can distil two categories easily enough.
            $REQUEST_HEADER["Upgrade"] == "websocket" {
                proxy.header += ( "upgrade" => "enable" )
                proxy.server = ( "" => ( ( "host" => ip_cloud, "port" => port_cool ) ) )
            } else $HTTP["request-method"] == "CONNECT" {
                proxy.header += ( "upgrade" => "enable" )
                proxy.header += ("force-http10" => "enable")
                proxy.server = ( "" => ( ( "host" => ip_cloud, "port" => port_cool ) ) )
            } else {
                proxy.server = ( "" => ( ( "host" => ip_cloud ) ) )
            }
        } else {
            proxy.server  = ( "" => ( ( "host" => ip_live ) ) )
        }

In any case, this has Collabora up and working again on our systems in Chrome and Android Firefox, which were my two failing test beds.