Apoc
Cloudflare Developers
Created by Apoc on 2/11/2025 in #workers-help
Tiered Cache not working due to bot management cookie
We are running into an issue with tiered cache caused by the bot management cookie (__cf_bm) that is sent when bot management is enabled.

A brief topology to help explain:
- We have 2 domains: www.xxx.com and proxy.xxx.com (same domain, just different subdomains)
- www.xxx.com is where all requests from normal users enter, and where our worker runs
- proxy.xxx.com is what the above worker calls to fetch pages, resources, etc. (essentially just a reverse proxy). This is not another worker; it is essentially the origin server.

On our staging domain, we do not have bot management enabled, so the responses from proxy.xxx.com do not include a cookie. Everything works as expected: our cache is hit, we get Age headers as we expect, etc. No issues.

On our production domain, we have bot management enabled, which causes the responses from proxy.xxx.com to include the __cf_bm cookie. Because the response has a Set-Cookie header, this seems to instruct Cloudflare not to cache it, completely defeating the purpose of enabling tiered cache.

Is there a way to work around this issue? Among other problems, it makes tiered cache unusable for us.
1 reply
Cloudflare Developers
Created by Apoc on 11/9/2023 in #workers-help
Workers fetch cache (tiered caching) - How to handle conditional caching?
I have a use case where we'd like to use tiered caching instead of the cache API to help shield our origins from requests, but I'm struggling to replicate the same logic using the fetch cache. Our worker essentially acts as a reverse proxy, handling internal routing & query parameters to the origins (multiple servers). This is a very slimmed-down version of our logic using the cache API:
export async function getReverseProxyResponseCacheApi(request: Request, requestRules: RequestRules, pageProviderUrl: string): Promise<Response> {
  // Disable caching when requesting on an internal IP
  const useCache = !SecurityService.isBusinessIpRequest(request);

  const cache = caches.default;

  // Appends various information to the URL to ensure that the cache key is unique. We include internal data
  // such as experiments, device, region, currency info, etc.
  const cacheKey = CacheKeyService.getCacheKey(request, pageProviderUrl);

  let page = useCache ? await cache.match(cacheKey) : undefined;

  if (!page) {
    const pageProviderResponse = await fetch(pageProviderUrl);
    // Re-wrap the response so that its headers are mutable
    page = new Response(pageProviderResponse.body, pageProviderResponse);

    if (useCache && page.ok) {
      const cacheControl = pageProviderResponse.headers.get('cache-control');
      // If the cache is not public, or set to no-store, don't cache
      if (cacheControl && !cacheControl.includes('no-store') && cacheControl.includes('public')) {
        // Headers must be changed through set(); assigning a property does not add a header
        page.headers.set('Cache-Tag', CacheKeyService.getCacheTags(request, requestRules));

        // Respects the cache-control header TTL that was present on the origin response.
        // Store a clone so the body of the response we return is not consumed.
        await cache.put(cacheKey, page.clone());
      }
    }
  }

  // If the page still isn't found, throw an error
  if (!page) {
    throw new Error('Page not found');
  }

  return page;
}
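As a side note on the `cache-control` check above: substring matching with `includes` can be made stricter by splitting the header into directives first. The following is only a minimal sketch; `parseCacheControl` and `isPubliclyCacheable` are hypothetical helper names, not part of the Workers API or of the code in this post.

```typescript
// Hypothetical helper: splits a Cache-Control header into a set of
// lowercase directive names, dropping any "=value" part.
function parseCacheControl(header: string | null): Set<string> {
  const directives = new Set<string>();
  if (!header) return directives;
  for (const part of header.split(',')) {
    const name = (part.split('=')[0] ?? '').trim().toLowerCase();
    if (name) directives.add(name);
  }
  return directives;
}

// Hypothetical helper mirroring the rule in the post:
// cache only when "public" is present and "no-store" is absent.
function isPubliclyCacheable(header: string | null): boolean {
  const directives = parseCacheControl(header);
  return directives.has('public') && !directives.has('no-store');
}
```

Under this sketch, a missing header is treated as not cacheable, which matches the `cacheControl && ...` guard in the original condition.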
However, with the fetch cache, we don't have the same control as far as I can tell.
export async function getReverseProxyResponseFetchCacheApi(request: Request, requestRules: RequestRules, pageProviderUrl: string): Promise<Response> {
  // Disable caching when requesting on an internal IP (same rule as the cache API version)
  const useCache = !SecurityService.isBusinessIpRequest(request);
  const cacheKey = CacheKeyService.getCacheKey(request, pageProviderUrl);

  let page = await fetch(pageProviderUrl, {
    cf: {
      cacheTtl: useCache ? requestRules.reverseProxyTtl : undefined,
      cacheTtlByStatus: {
        '300-599': 0,
      },
      cacheKey: cacheKey,
      cacheTags: [CacheKeyService.getCacheTags(request, requestRules)],
    },
  });

  // Re-wrap the response so that its headers are mutable
  page = new Response(page.body, page);

  // Cache-Control is completely ignored by the tiered cache since we provided a TTL
  const cacheControl = page.headers.get('cache-control');

  // If the cache is not public, or set to no-store, don't cache
  if (!page.ok || (cacheControl && (!cacheControl.includes('public') || cacheControl.includes('no-store')))) {
    // How do we handle this? By this point the fetch cache has already stored the response.
    // await caches.default.delete(cacheKey);
    return page;
  }

  return page;
}
Is there any way to work around this that doesn't involve using the cache purge API from within the worker? (250k purges per day would quickly be hit)
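One direction that might be worth testing, offered only as an unverified sketch: the Workers documentation describes `cacheTtl` as forcing caching regardless of response headers, while `cacheEverything: true` without a TTL is described as equivalent to the Cache Everything page rule, which respects origin cache headers. It also describes a negative `cacheTtlByStatus` value as "never cache". Assuming those behaviours hold together and apply to tiered cache in this setup (not confirmed here), the `cf` options could be built as below. `CfCacheOptions` and `buildCfCacheOptions` are hypothetical names introduced for illustration.

```typescript
// Hypothetical subset of the `cf` fetch options used in this post.
interface CfCacheOptions {
  cacheEverything?: boolean;
  cacheKey?: string;
  cacheTags?: string[];
  cacheTtlByStatus?: Record<string, number>;
}

// Hypothetical builder. Assumption: omitting cacheTtl lets the origin's
// Cache-Control header decide whether and for how long a 2xx is cached.
function buildCfCacheOptions(useCache: boolean, cacheKey: string, cacheTags: string[]): CfCacheOptions {
  if (!useCache) {
    // Assumption: a negative TTL means "never cache" for these statuses.
    return { cacheTtlByStatus: { '200-599': -1 } };
  }
  return {
    cacheEverything: true,
    cacheKey,
    cacheTags,
    // Never cache redirects or errors; 2xx is left to origin headers.
    cacheTtlByStatus: { '300-599': -1 },
  };
}
```

It would be passed as `fetch(pageProviderUrl, { cf: buildCfCacheOptions(useCache, cacheKey, tags) })`. Whether origin `private` or `no-store` responses are then actually skipped by the cache would need to be confirmed by checking the `CF-Cache-Status` and `Age` headers on real responses.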
2 replies