duplicate cache pages: Varnish

Server Fault Asked by Sukhjinder Singh on November 16, 2021

We recently configured Varnish on our server. The setup succeeded, but we noticed that if we open any page in multiple browsers, Varnish sends the request to Apache whether or not the page is already cached. If we refresh twice in each browser, it creates duplicate copies of the same page.

What should happen:

Once a page is cached by Varnish, subsequent requests for it should be served from Varnish itself, whether we open the same page in another browser or from a different IP address.

Here is my default.vcl file:

backend default {
    .host = "127.0.0.1";
    .port = "80";
}

sub vcl_recv {
    if (req.url ~ "^/search/.*$") {
        # Keep the query string on search URLs.
    } else {
        # Strip the query string from every other URL.
        set req.url = regsub(req.url, "\?.*", "");
    }

if (req.restarts == 0) {
    if (req.http.x-forwarded-for) {
        set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
    } else {
        set req.http.X-Forwarded-For = client.ip;
    }
}

if (!req.backend.healthy) {
    unset req.http.Cookie;
}

set req.grace = 6h;

if (req.url ~ "^/status\.php$" ||
        req.url ~ "^/update\.php$" ||
        req.url ~ "^/admin$" ||
        req.url ~ "^/admin/.*$" ||
        req.url ~ "^/flag/.*$" ||
        req.url ~ "^.*/ajax/.*$" ||
        req.url ~ "^.*/ahah/.*$") {
            return (pass);
}

if (req.url ~ "(?i)\.(pdf|asc|dat|txt|doc|xls|ppt|tgz|csv|png|gif|jpeg|jpg|ico|swf|css|js)(\?.*)?$") {
    unset req.http.Cookie;
}

if (req.http.Cookie) {
    set req.http.Cookie = ";" + req.http.Cookie;
    set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");    
    set req.http.Cookie = regsuball(req.http.Cookie, ";(SESS[a-z0-9]+|SSESS[a-z0-9]+|NO_CACHE)=", "; 1=");
    set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

        if (req.http.Cookie == "") {
            unset req.http.Cookie;
        }
        else {
            return (pass);
        }
}

if (req.request != "GET" && req.request != "HEAD" &&
    req.request != "PUT" && req.request != "POST" &&
    req.request != "TRACE" && req.request != "OPTIONS" &&
    req.request != "DELETE") 
    {return(pipe);}     /* Non-RFC2616 or CONNECT which is weird. */

if (req.request != "GET" && req.request != "HEAD") {
    return (pass);
}

if (req.http.Accept-Encoding) {
    if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") {
        # No point in compressing these
        remove req.http.Accept-Encoding;
    } else if (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
    } else if (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
    } else {
        # unknown algorithm
        remove req.http.Accept-Encoding;
    }
}
    return (lookup);
}

sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Varnish-Cache = "HIT";
    }
    else {
        set resp.http.X-Varnish-Cache = "MISS";
    }
}

sub vcl_fetch {
    if (beresp.status == 404 || beresp.status == 301 || beresp.status == 500) {
        set beresp.ttl = 10m;
    }
    if (req.url ~ "(?i)\.(pdf|asc|dat|txt|doc|xls|ppt|tgz|csv|png|gif|jpeg|jpg|ico|swf|css|js)(\?.*)?$") {
        unset beresp.http.set-cookie;
    }
    set beresp.grace = 6h;
}

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (hash);
}

sub vcl_pipe {
    set req.http.connection = "close";
}

sub vcl_hit {
    if (req.request == "PURGE") 
        {ban_url(req.url);
    error 200 "Purged";}

    if (!obj.ttl > 0s)
        {return(pass);}
}

sub vcl_miss {
    if (req.request == "PURGE") 
        {error 200 "Not in cache";}
}

Solution

Pitfall – Vary: User-Agent

Some applications or application servers send Vary: User-Agent along with their content. This instructs Varnish to cache a separate copy for every variation of User-Agent there is. There are plenty. Even a single patchlevel of the same browser will generate at least 10 different User-Agent headers based just on what operating system they are running.

So if you really need to Vary based on User-Agent, be sure to normalize the header or your hit rate will suffer badly. Use the User-Agent-washing code under Workaround as a template.

https://www.varnish-cache.org/docs/3.0/tutorial/vary.html#tutorial-vary

Workaround

One workaround is to do what we call “User-Agent washing”, where
Varnish rewrites the User-Agent to the handful of different variants
your backend really cares about, along the lines of:

sub vcl_recv {
    if (req.http.user-agent ~ "MSIE") {
        set req.http.user-agent = "MSIE";
    } else {
        set req.http.user-agent = "Mozilla";
    }
}
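
If the backend sends Vary: User-Agent even though its pages do not actually differ by browser, another approach sometimes used is to remove User-Agent from the Vary header as the response is fetched, so that Varnish stores a single variant. The following is a minimal sketch in Varnish 3 syntax (matching the vcl_fetch in the question). It is not taken from the linked tutorial, and it assumes the backend output really is identical for every browser:

```vcl
sub vcl_fetch {
    # Assumption: responses are the same for every browser, so varying
    # on User-Agent only fragments the cache.
    if (beresp.http.Vary ~ "(?i)User-Agent") {
        # Drop the User-Agent token together with a preceding comma.
        set beresp.http.Vary = regsub(beresp.http.Vary, "(?i),? *User-Agent *", "");
        # Tidy a leading comma left behind when User-Agent was listed first.
        set beresp.http.Vary = regsub(beresp.http.Vary, "^, *", "");
        # If nothing remains, remove the header entirely.
        if (beresp.http.Vary == "") {
            unset beresp.http.Vary;
        }
    }
}
```

These statements can also be merged into an existing vcl_fetch rather than declared separately. If the backend does serve different markup to some browsers, washing the User-Agent in vcl_recv is the safer choice.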

2 Answers

First thing is that it's impossible for varnish to cache 2 copies of a URL.

Now, I am not sure about the hit/miss check, but when I need to check, I do it in Firefox using Firebug.

I open Firebug and load the website.

It shows the Age header of every page or image fetched, as in the attached picture.

If the age keeps increasing over time, then Varnish is working well as far as I am concerned.

And from what I can see, it's working fine for your site too.

[Image: How to check Varnish Cache working]
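
The same check can be made without watching Age by having Varnish report the hit count directly. Below is a minimal sketch in Varnish 3 syntax that extends the vcl_deliver from the question; the X-Varnish-Hits header name is an assumption chosen for illustration:

```vcl
sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Varnish-Cache = "HIT";
        # Hypothetical header name: how many times this cached object
        # has been delivered so far.
        set resp.http.X-Varnish-Hits = obj.hits;
    } else {
        set resp.http.X-Varnish-Cache = "MISS";
    }
}
```

With this in place, a second request for the same URL from a different browser should show HIT with a count that keeps rising. If every browser starts again at MISS, the object is probably being stored once per variant, for example because of a Vary header.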

Answered by Napster_X on November 16, 2021

This is what helped me solve this problem:

Comment out or delete the lines that form your vcl_hash function and restart Varnish. vcl_hash is used to create specific caches, say for a user, a session, or a certain IP address. If you want the page to be served from cache once it has been cached, you can do away with the vcl_hash function.

Test it out in a test environment first, just in case.

HTH.

Edit (Clarification)

Option 1:

Comment out these lines by adding a "#" sign at the beginning of each line. So these lines

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (hash);
}

would become:

#sub vcl_hash {
#    hash_data(req.url);
#    if (req.http.host) {
#        hash_data(req.http.host);
#    } else {
#        hash_data(server.ip);
#    }
#    return (hash);
#}

Lines beginning with a # are ignored by Varnish.

Option 2:

Alternatively, you can remove the above lines altogether. The end result is the same.

Answered by KM. on November 16, 2021
