Anatomy of a Web Cache Poisoning Attack

What is a web cache?

A web cache is a temporary storage system that stores copies of web resources (like HTML pages, images, JavaScript files, etc.) to improve performance and reduce bandwidth usage.

When a user requests a web resource, instead of fetching it directly from the origin server every time, the web cache checks if it has a valid cached copy of that resource. If it does, the cache serves the cached copy, which is much faster than requesting it from the server. If not, it fetches the resource from the origin server, caches it, and then serves it to the user.

There are different types of web caches:

Browser Cache: Modern web browsers have built-in caching to store static web content locally on the user’s device.
Proxy Cache: A proxy server can cache web resources for multiple users, reducing bandwidth for the organization.
Reverse Proxy Cache: This sits in front of web servers and caches content for delivery to web clients.
Content Delivery Network (CDN) Cache: CDN providers have caching servers distributed globally to serve content closest to users.

Web caches work based on HTTP caching headers and heuristics about resource freshness. The cache control headers allow origin servers to instruct caches on cacheability and cache expiry.

Benefits of web caching include:

Reduced bandwidth costs
Faster content delivery
High availability and redundancy
Offload traffic from origin servers

However, caching also introduces potential security issues like cache poisoning, necessitating cache validation and other hardening measures.

Overall, web caching is a critical optimization for delivering high-performance, scalable web applications while carefully managing its security implications.

How does it work?

Web caches work by temporarily storing copies of web resources (HTML pages, images, JavaScript files, etc.) so that they can be quickly retrieved and served to clients without having to fetch them from the origin server each time. Here’s a general overview of how web caching works:

Client Request: A client (web browser or other user agent) sends an HTTP request for a specific web resource to a web server.
Cache Check: Before forwarding the request to the origin server, any intermediate caches (browser cache, proxy cache, CDN cache, etc.) check if they have a fresh/valid copy of the requested resource stored locally.
Cache Hit or Miss:

Cache Hit: If a valid cached copy exists, the cache retrieves the cached resource and returns it directly to the client, without forwarding the request to the origin server.
Cache Miss: If there is no cached copy, or the cached copy has expired, the request gets forwarded to the origin server.

Origin Server Response: If there was a cache miss, the origin server processes the request and sends the requested resource back in the HTTP response.
Cache Storage: Any intermediate caches inspect the response headers (like Cache-Control, Expires, ETag) and caching rules to determine if and for how long they should cache the resource locally.
Cache Expiration: Cached resources have an associated expiry time or validation strategy. Once expired, the cache treats the resource as stale and, on the next request, either fetches a new copy from the origin or revalidates its stored copy with a conditional request.

Caches use various mechanisms like maximum age, validation tokens (ETags), and heuristic calculations to determine cache freshness and expiration. Smart caching algorithms try to keep frequently accessed resources cached while evicting stale or unused entries.

The caching process is designed to be transparent – clients receive the same resource data regardless of whether it came from the cache or origin. Caching reduces latency, bandwidth usage, and load on the origin, improving web performance and scalability.
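
The hit/miss flow described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular cache is implemented: the class name TinyCache, the fixed max_age lifetime, and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class TinyCache:
    """Toy in-memory cache keyed by URL with a fixed freshness lifetime."""

    def __init__(self, fetch_from_origin: Callable[[str], str], max_age: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin  # stands in for the origin server
        self._max_age = max_age          # seconds a stored copy stays fresh
        self._clock = clock              # injectable so expiry can be simulated
        self._store: Dict[str, Tuple[str, float]] = {}

    def get(self, url: str) -> Tuple[str, str]:
        """Return (body, "HIT") for a fresh stored copy, else (body, "MISS")."""
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None:
            body, stored_at = entry
            if now - stored_at < self._max_age:
                return body, "HIT"
        # Cache miss or stale entry: go to the origin and store the result.
        body = self._fetch(url)
        self._store[url] = (body, now)
        return body, "MISS"
```

The first request for a URL is a miss and populates the store, repeated requests within max_age are hits, and a request after the lifetime has elapsed goes back to the origin.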

What is a cache key?

A cache key is a unique identifier that the cache uses to store and retrieve cached resources. Cache keys are essential for efficient cache management and ensuring that the correct cached response is served for each client request.

Cache keys typically consist of several components that help the cache distinguish between different resources and versions of those resources. These components can include:

URL: The requested URL is often the primary component of the cache key, as it uniquely identifies the resource being requested.
Query Parameters: If the URL includes query parameters, they may be included in the cache key to differentiate between different variations of the same resource.
HTTP Headers: Certain HTTP request headers, such as Host or Accept-Encoding, may be included in the cache key to account for variations in the response based on these headers.
Cookies: For resources that depend on cookie values (e.g., personalized content), the cache key may include relevant cookie values to ensure that the correct cached version is served based on the user’s cookies.
User Agent: In some cases, the User-Agent header may be included in the cache key to cache different versions of a resource for different types of clients (e.g., desktop vs. mobile).
Vary Header: The Vary HTTP response header can instruct the cache to create different cached versions of a resource based on specific request headers, which are then incorporated into the cache key.

Let’s examine how cache keys work with an example HTTP request.

Suppose a client (web browser) sends the following HTTP request:


GET /products?category=electronics&sort=price HTTP/1.1
Host: www.example.com
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3
Cookie: session_id=abc123

In this request, the client is asking for the /products resource with query parameters category=electronics and sort=price. The request also includes additional headers like Host, Accept-Encoding, User-Agent, and a Cookie.

To construct the cache key for this request, the cache may use the following components:

URL: /products
Query Parameters: category=electronics&sort=price
Host Header: www.example.com
Vary Header (if present in the response from the origin server)
Cookie: session_id=abc123

The cache key could then look something like this:

/products?category=electronics&sort=price@www.example.com@session_id=abc123

When the cache receives this request, it will check if it has a cached response associated with this cache key. If a valid cached response exists, the cache will serve that response directly to the client.

If the cache does not have a valid cached response for this key, it will forward the request to the origin server, cache the response from the server using this key, and then serve the response to the client.

The cache key ensures that the cache can differentiate between different versions of the same resource based on the specific request parameters and headers. For example, if another client sends a request for /products?category=electronics&sort=name, the cache would generate a different key and treat it as a separate resource.

By using cache keys that incorporate relevant request components, caches can efficiently store and retrieve the appropriate cached responses, improving performance and reducing the load on the origin server.
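
As a rough sketch of the idea, the function below assembles a key from the host, path, query string, and a configurable list of keyed headers. The function name build_cache_key, the "|" delimiter, and the keyed_headers parameter are assumptions for illustration; real caches each have their own key format and configuration.

```python
from typing import Dict, Iterable
from urllib.parse import urlsplit


def build_cache_key(host: str, target: str, headers: Dict[str, str],
                    keyed_headers: Iterable[str] = ()) -> str:
    """Join the keyed request components into one string.

    Anything not listed here (for example User-Agent, unless it is passed in
    keyed_headers) is an unkeyed input and does not affect the key.
    """
    parts = urlsplit(target)
    key = [host.lower(), parts.path, parts.query]
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in sorted(h.lower() for h in keyed_headers):
        key.append(f"{name}={lowered.get(name, '')}")
    return "|".join(key)
```

With the example request above and Cookie as the only keyed header, two requests that differ only in User-Agent map to the same key, while changing sort=price to sort=name produces a different one.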

Detection of cache

To check if caching is enabled for a website, you can use several methods:

Browser Developer Tools:

Open the website in your web browser.
Right-click anywhere on the page and select “Inspect” or press Ctrl+Shift+I (Cmd+Option+I on Mac) to open the developer tools.
Go to the “Network” tab.
Reload the page (Ctrl+R or Cmd+R) to capture network requests.
Look for the resources (HTML, CSS, JavaScript, images, etc.) in the list.
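Check the “Status” and “Size” columns for each resource. In Chromium-based browsers, resources served from the browser cache show “(memory cache)” or “(disk cache)” in place of a transfer size, and a “304 Not Modified” status means a cached copy was revalidated with the server. A “200 OK” status with a full transfer size means the resource was fetched over the network, so on its own it does not indicate caching.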

HTTP Headers:

You can inspect the HTTP response headers of the website’s resources to check for caching directives.
Use browser extensions like “HTTP Headers” for Chrome or “Live HTTP Headers” for Firefox to view response headers.
Look for caching-related headers such as Cache-Control, Expires, Last-Modified, and ETag.
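If the Cache-Control header is present, its directives describe the caching behavior: public, private, and max-age allow caching within the stated limits, no-cache requires revalidation before a stored copy is reused, and no-store forbids caching altogether.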

Online Tools:

There are online tools available that analyze HTTP headers and provide information about caching.
Websites like webpagetest.org or developers.google.com/speed/pagespeed/insights/ can provide insights into caching behavior and optimization suggestions.

Caching Directives in HTML:

Sometimes, websites include meta tags or directives in their HTML code to control caching behavior.
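Look for meta tags like <meta http-equiv="Cache-Control" content="max-age=3600"> or <meta http-equiv="Expires" content="Wed, 21 Oct 2026 07:28:00 GMT"> in the HTML source. Intermediary caches do not parse HTML, so these tags are at most a hint; the HTTP response headers are what actually control caching.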

By using these methods, you can determine whether caching is enabled for a website and gain insight into the caching behavior of its resources.
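
The header inspection can also be scripted. The sketch below takes response headers that have already been collected (for example from the browser or a proxy tool) and lists the caching hints it finds. The function name caching_hints is hypothetical. Age is a standard header added by caches, while X-Cache and CF-Cache-Status are non-standard headers that some CDNs and proxies add, so their presence and values vary by provider.

```python
from typing import Dict, List


def caching_hints(headers: Dict[str, str]) -> List[str]:
    """Summarize caching-related signals found in a set of response headers."""
    h = {name.lower(): value for name, value in headers.items()}
    hints: List[str] = []

    directives = [d.strip().lower()
                  for d in h.get("cache-control", "").split(",") if d.strip()]
    if "no-store" in directives:
        hints.append("no-store: response should not be cached")
    for directive in directives:
        if directive.startswith(("max-age=", "s-maxage=")):
            hints.append(f"freshness directive: {directive}")

    # Age is set by a cache to say how long the response has been stored.
    if "age" in h:
        hints.append(f"served by a cache (Age: {h['age']})")

    # Non-standard, provider-specific headers; treat them as hints only.
    for name in ("x-cache", "cf-cache-status"):
        if name in h:
            hints.append(f"{name}: {h[name]}")
    return hints
```

An empty result does not prove that no cache is involved; it only means none of these particular signals were present.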

Unkeyed inputs

Inputs that aren’t included in the cache key are called unkeyed inputs. The Param Miner extension for Burp Suite can help you find them.

Exploits

Poisoning with an unkeyed header

When the value of an unkeyed header is reflected in the response, it can be tested for poisoning.

In the example below, the website reflects the X-Forwarded-Host header in its response without any sanitization, so it may be vulnerable to basic XSS payloads. Because the header is not part of the cache key, the response containing the injected payload can be cached, putting anyone who visits the page at risk of XSS.

GET /dashboard HTTP/1.1
Host: vulnerable.com
X-Forwarded-Host: a."><script>alert(1)</script>"
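
To make the mechanism concrete, here is a toy simulation of the attack. Everything in it is hypothetical: the origin function stands in for a vulnerable application that reflects X-Forwarded-Host into a script URL, and the Cache class keys responses on the Host header and path only.

```python
from typing import Dict, Tuple


def origin(path: str, headers: Dict[str, str]) -> str:
    """Hypothetical vulnerable origin: reflects X-Forwarded-Host unescaped."""
    host = headers.get("X-Forwarded-Host", headers["Host"])
    return f'<script src="//{host}/app.js"></script>'


class Cache:
    """Toy cache whose key is (Host, path); X-Forwarded-Host is unkeyed."""

    def __init__(self) -> None:
        self.store: Dict[Tuple[str, str], str] = {}

    def handle(self, path: str, headers: Dict[str, str]) -> str:
        key = (headers["Host"], path)
        if key not in self.store:
            # Cache miss: whatever the origin returns for THIS request is
            # stored and replayed to everyone who maps to the same key.
            self.store[key] = origin(path, headers)
        return self.store[key]
```

If the attacker's request is the one that populates the cache, a later visitor who sends no X-Forwarded-Host header at all still receives the response containing the payload, because both requests share the same key.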

Poisoning via an unkeyed cookie

Cookie values can also be reflected in a page’s response. If the cookie is unkeyed, this can be abused to inject an XSS payload.

GET / HTTP/1.1
Host: vulnerable.com
Cookie: session=VftzO7ZtiBj5zNLRAuFpXpSQLjS4lBmU; fehost=asd"%2balert(1)%2b"

Poisoning via unkeyed query string

Similarly, when the query string is unkeyed and reflected in the response, it can be used to inject a payload.

GET //?"><script>alert(1)</script> HTTP/1.1
Host: redacted-newspaper.net

HTTP/1.1 200 OK

<meta property="og:url" content="//redacted-newspaper.net//?x"><script>alert(1)</script>"/>

Poisoning by path traversal confusion

If caching rules are tailored to specific folders or file types, they can become a loophole. Instead of targeting page parameters, an attacker can craft a distinct URL that ends in a static extension such as .js or .css, or one that points into a directory designated for aggressive caching, such as /static/, /js/, or /upload/. Read this write-up for a detailed exploitation of such a vulnerability.

Redirect DoS

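What would you do if there’s an unkeyed query string that’s not vulnerable to XSS?

Let’s understand it using cloudflare.com.

In this example, a request for /login on www.cloudflare.com is answered with a redirect to /login/, and the query string is copied into the redirect target even though it is not part of the cache key. An attacker can therefore send: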

GET /login?x=very-long-string... HTTP/1.1
Host: www.cloudflare.com
Origin: https://dontpoisoneveryone/

Then when someone else tries to visit the login page, they’ll naturally get a redirect with a long query string:

GET /login HTTP/1.1
Host: www.cloudflare.com
Origin: https://dontpoisoneveryone/

HTTP/1.1 301 Moved Permanently
Location: /login/?x=very-long-string...

The browser follows the redirect, but the resulting URL is too long and the request is blocked by the server. So with one request, we can persistently take down this route to Cloudflare’s login page.
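
The toy model below walks through the same sequence. It is only a sketch under stated assumptions: the 40-character MAX_URL_LENGTH limit, the 414 status code, and the path-only cache key are invented for the example and are not taken from Cloudflare's actual configuration.

```python
from typing import Dict, Tuple

MAX_URL_LENGTH = 40  # hypothetical limit enforced by the origin


def origin(target: str) -> Tuple[int, str]:
    """Toy origin: returns (status, Location header or body)."""
    if len(target) > MAX_URL_LENGTH:
        return 414, ""
    path, _, query = target.partition("?")
    if path == "/login":
        # Redirect to the trailing-slash form, carrying the query string along.
        return 301, "/login/" + (f"?{query}" if query else "")
    return 200, "login page"


def cached_request(cache: Dict[str, Tuple[int, str]], target: str) -> Tuple[int, str]:
    """Toy cache keyed on the path only, so the query string is unkeyed."""
    path = target.partition("?")[0]
    if path not in cache:
        cache[path] = origin(target)
    return cache[path]


def longest_allowed_login_url() -> str:
    """Build a /login URL that is exactly at the origin's length limit."""
    prefix = "/login?x="
    return prefix + "A" * (MAX_URL_LENGTH - len(prefix))
```

The attacker's URL is just short enough to be accepted, but the redirect adds one character (the trailing slash), so the cached Location points at a URL the origin rejects. A victim who requests plain /login receives that cached redirect and ends up at the rejected URL.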

Poisoning via Unkeyed method

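Another way to keep parameters out of the cache key is to send them in the body of a POST request. Some caches do not include the request method in the cache key, so the response to a POST carrying a malicious body may be stored and later served for an ordinary GET to the same URL.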

Parameter cloaking

When the origin server accepts a semicolon (;) as a query parameter separator but the caching proxy, in its default configuration, does not, attackers can exploit the difference in how the two interpret the same request. The proxy treats everything after the semicolon as part of the preceding parameter’s value, so if that parameter is unkeyed, the hidden parameter never reaches the cache key. As a result, a malicious request can be cached under the key of a completely safe one.

GET /?u=legitimate&content=1;u=malicious HTTP/1.1
Host: example.com
Upgrade-Insecure-Requests: 1

The server sees three parameters here: u, content, and then u again. The proxy, on the other hand, treats the full string 1;u=malicious as the value of content. If content is excluded from the cache key, the key contains only example.com/?u=legitimate, while the server may act on the second value of u.
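
The parsing mismatch can be reproduced with a small sketch. The helper names are hypothetical, and two behaviors are assumed for illustration: the proxy excludes content from the key, and the server keeps the last value when a parameter is repeated. Real frameworks differ on both points.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]


def split_query(query: str, separators: str) -> List[Pair]:
    """Split a query string into (name, value) pairs on the given separators."""
    chunks = [query]
    for separator in separators:
        chunks = [piece for chunk in chunks for piece in chunk.split(separator)]
    pairs: List[Pair] = []
    for chunk in chunks:
        if chunk:
            name, _, value = chunk.partition("=")
            pairs.append((name, value))
    return pairs


def cache_key(host: str, query: str, unkeyed: Set[str]) -> str:
    """Proxy view: '&' is the only separator; unkeyed parameters are dropped."""
    keyed = [f"{n}={v}" for n, v in split_query(query, "&") if n not in unkeyed]
    return f"{host}/?" + "&".join(keyed)


def server_params(query: str) -> Dict[str, str]:
    """Server view: '&' and ';' both separate parameters; last duplicate wins."""
    return dict(split_query(query, "&;"))
```

For the query string u=legitimate&content=1;u=malicious, cache_key with content unkeyed yields example.com/?u=legitimate, while server_params reports u as malicious.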

Poisoning via fat get request

Some web applications sit behind a caching system that builds the cache key from the request line and ignores the request body. By sending a GET request with a body (a “fat” GET request), an attacker can make the cache store a response containing user-controlled input taken from the body. This cached response can then be served to a victim, leading to various vulnerabilities.

This only works if the website accepts GET requests with a body, although there are potential workarounds. One is to induce “fat GET” handling by overriding the HTTP method, for example with a header such as X-HTTP-Method-Override on applications that honor it. A plain fat GET request looks like this:

GET /?foxy=iota HTTP/1.1
Host: github.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 22

report=innocent-victim
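
A short simulation shows why this matters. It is a sketch under assumptions: the /contact path, the origin_render and handle names, and the rule that body fields override query-string fields are all made up for the example, and real frameworks vary in which source they prefer.

```python
from typing import Dict
from urllib.parse import parse_qs, urlsplit


def origin_render(target: str, body: str) -> str:
    """Hypothetical origin: form fields in the body override the query string."""
    params = parse_qs(urlsplit(target).query)
    params.update(parse_qs(body))
    return f"Reporting user: {params['report'][0]}"


def handle(cache: Dict[str, str], target: str, body: str = "") -> str:
    """Toy cache keyed on the request target only; the body is unkeyed."""
    if target not in cache:
        cache[target] = origin_render(target, body)
    return cache[target]
```

After the fat GET has populated the cache, a normal request for the same URL returns the response built from the attacker's body rather than from the URL the victim actually asked for.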

How can we help?

At Vulncure, our team of cybersecurity experts can help safeguard your organization against the threats of web cache poisoning attacks. Through comprehensive assessments and testing, we identify vulnerabilities in your web applications, cache implementations, and infrastructure that could be exploited for malicious cache injections. Our services provide tailored guidance on fortifying cache validation, key generation, and purging mechanisms using industry best practices. Partner with us to ensure your web caching infrastructure remains secure and resilient, protecting your systems and users from cache poisoning and other web-based attacks. Contact us today to strengthen your security posture and schedule a meeting to discuss your specific needs.

Conclusion

In conclusion, while web caching offers significant performance benefits, it also introduces security risks if not properly implemented. Web cache poisoning attacks allow malicious actors to inject content that gets served to users, leading to severe consequences like defacement, data theft, and malware distribution.

Mitigating these risks requires robust input validation, secure coding practices, strict cache validation rules, and secure key generation algorithms. Regular security assessments and penetration testing are crucial to identify vulnerabilities in caching infrastructures.

In today’s landscape, web cache poisoning poses a significant threat that cannot be ignored. By following best practices, implementing robust controls, and partnering with security experts, organizations can leverage caching safely while protecting their systems, data, and users.

Prioritize the security of your web caching infrastructure, remain vigilant, and take a proactive approach to defend against the dangers of cache poisoning attacks. Remember, complacency can leave you vulnerable to emerging threats.

About me

Hello, I’m Harshi Gupta, a seasoned penetration tester with expertise in both internal and external assessments. Cybersecurity is not just a career path for me; it’s my hobby and passion. With a wealth of experience in identifying and mitigating security vulnerabilities, I am dedicated to ensuring the resilience of organizations’ digital assets. For networking opportunities and engaging discussions, feel free to reach out to me via LinkedIn and Twitter.
