Cloudflare Developers•2mo ago

Deadlock in cache.put() - platform issue or am I doing something wrong?

The following worker code results in a deadlock (no response is ever sent to the client), both in a local Wrangler instance and in the Playground:

export default {
    async fetch(request, env, ctx) {
        let myResponse = await fetch('http://example.com/');
        myResponse = new HTMLRewriter()
            .on('h1', {
                async element(e) {
                    await caches.default.put(new Request('http://example.com/dummy'), new Response('foo'));
                    e.setInnerContent('Rewritten');
                }
            })
            .transform(myResponse);
        caches.default.put(request, myResponse.clone());
        return myResponse;
    },
};

This is a minimized version of an issue we have with our worker in production. Presumably the same issue is causing some of our requests to never be answered, which frustrates visitors. The issue seems to be triggered by a cache put() call happening while another put() call is waiting for the provided response to complete. As the cache keys are different, I can't see a good reason why the two calls should block each other. My question is: does this code try to do something unsupported, or is this supposed to work? It feels like a bug in the Workers runtime, but I may be missing something.

6 Replies

Csaba Varga•2mo ago

This variant is also interesting:

export default {
    async fetch(request, env, ctx) {
        let myResponse = await fetch('http://example.com/');
        myResponse = new HTMLRewriter()
            .on('h1', {
                async element(e) {
                    const dummyResponse = new Response('foo');
                    caches.default.put(new Request('http://example.com/dummy'), dummyResponse.clone());
                    console.log('about to call text()');
                    const text = await dummyResponse.text();
                    console.log(`text() == ${text}`);
                    e.setInnerContent('Rewritten');
                }
            })
            .transform(myResponse);
        caches.default.put(request, myResponse.clone());
        return myResponse;
    },
};

Here, I'm not waiting for the put() call to finish, but its presence still interferes with consuming the response. The message "about to call text()" appears on the console, but nothing else is logged, so the code is waiting forever for the body of the dummy response. (This is despite the response body being already in memory, and trivially short.) It's definitely weird that even after cloning the response, the two copies can block each other.

Peps•2mo ago

Hm, as far as I'm aware, you can't really use async functions inside HTMLRewriter handlers. You could still do something like:

            .on('h1', {
                element(e) {
                    ctx.waitUntil(caches.default.put(new Request('http://example.com/dummy'), new Response('foo')));
                    e.setInnerContent('Rewritten');
                }
            })

James•2mo ago

You can use async functions with HTMLRewriter as of https://blog.cloudflare.com/asynchronous-htmlrewriter-for-cloudflare-workers, but it seems this hits a runtime bug that has something to do with the cache (https://github.com/cloudflare/workerd/issues/2498). ctx.waitUntil is probably the best approach for writing to the cache without hitting a deadlock inside these handlers right now.

Chaika•2mo ago

fwiw just waitUntil isn't enough: if you take his second variant and wrap the cache.put in ctx.waitUntil, it still gets stuck at the await dummyResponse.text(). If you clone it and then ctx.waitUntil at the end, it seems to work regardless of how many times it runs. Weird stuff.
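
One possible reading of "clone it and then ctx.waitUntil at the end" is sketched below. This is an interpretation of the reply above, not a confirmed fix for the runtime bug. The helper name makeH1Handler and the copyForCache variable are hypothetical; ctx is the Worker execution context and cache would be caches.default.

```javascript
// Hypothetical helper (not part of any Cloudflare API): builds an
// HTMLRewriter element handler. Assumes `ctx` has waitUntil() and
// `cache` has put(), as the Worker context and caches.default do.
function makeH1Handler(ctx, cache) {
    return {
        async element(e) {
            const dummyResponse = new Response('foo');
            // Clone up front, before either copy's body has been read.
            const copyForCache = dummyResponse.clone();
            // Consume the original first...
            const text = await dummyResponse.text();
            console.log(`text() == ${text}`);
            e.setInnerContent('Rewritten');
            // ...and only then start the cache write, as the last step.
            // waitUntil keeps the write alive without awaiting it here.
            ctx.waitUntil(cache.put(new Request('http://example.com/dummy'), copyForCache));
        },
    };
}
```

In the worker it would be attached with something like .on('h1', makeH1Handler(ctx, caches.default)).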

James•2mo ago

oh wow, interesting. Good info, thank you

Csaba Varga•5w ago

Hi! Thank you for taking a look at this. Yes, in our production code I worked around the issue by collecting responses in an array and calling cache.put on them after the rewriter was done, but I assume it should work in parallel as well. Looks like that linked issue is exactly the same thing I ran into, so I'm watching it.
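
A rough sketch of that kind of workaround, for anyone landing here later. It is not the actual production code: the names flushPendingPuts, pending and worker are hypothetical, and it assumes "the rewriter was done" can be detected by reading the transformed body to the end before touching the cache.

```javascript
// Hypothetical helper: writes every collected [request, response] pair
// to the given cache and resolves once all writes have finished.
async function flushPendingPuts(cache, pending) {
    await Promise.all(pending.map(([req, res]) => cache.put(req, res)));
}

const worker = {
    async fetch(request, env, ctx) {
        const pending = [];
        const upstream = await fetch('http://example.com/');
        const rewritten = new HTMLRewriter()
            .on('h1', {
                element(e) {
                    // Only record the entry here; no cache call inside the handler.
                    pending.push([new Request('http://example.com/dummy'), new Response('foo')]);
                    e.setInnerContent('Rewritten');
                },
            })
            .transform(upstream);
        // Reading the whole body forces the rewriter to run to completion.
        const html = await rewritten.text();
        const finalResponse = new Response(html, rewritten);
        pending.push([request, finalResponse.clone()]);
        // All cache writes start only after the rewriter has finished.
        ctx.waitUntil(flushPendingPuts(caches.default, pending));
        return finalResponse;
    },
};
// In a Worker module this object would be the default export:
// export default worker;
```

The trade-off is that the page is buffered in memory instead of being streamed to the client, so this probably only makes sense for reasonably small responses.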
