You can’t force a browser (or other User Agent) to do anything. You must carefully implement your server-side code to prevent malicious or accidental damage. That said, you can sometimes improve the user experience a lot by asking browsers nicely not to cache anything, and thus to request the page again when (amongst other things) the Back button is used.
You can do this with the following HTTP headers. (Note: This doesn’t work on Opera as it does on other browsers.)
Cache-Control: no-cache
Cache-Control: no-store
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Pragma: no-cache
In PHP you can do this with:
header("Cache-Control: no-cache"); // Forces caches to check with the origin server before reusing the page
header("Cache-Control: no-store", false); // Directs caches not to store the page under any circumstance (false = add this header instead of replacing the previous Cache-Control)
header("Expires: " . gmdate('D, d M Y H:i:s', 0) . ' GMT'); // Causes the proxy cache to see the page as "stale" (gmdate, so the time really is GMT)
header("Pragma: no-cache"); // HTTP 1.0 backward compatibility
This is not the perfect solution. If the browser ignores these headers (as Opera will – see comments) then you can still go back and see stale pages. I wonder what banks do to get around this where the viewing of a stale page can be considered a security breach?
Opera is not ignoring these headers. If you set Cache-Control to no-cache the page will not be cached, and with no-store it will not be stored on disk. But it is very common to confuse the cache (a local copy of a resource, kept to avoid a server round-trip) with history (going back to the pages you have viewed before). These are two very different concepts. If you had followed the link to the HTTP 1.1 specification, Section 13.13, you would have read:
13.13 History Lists
User agents often have history mechanisms, such as “Back” buttons and history lists, which can be used to redisplay an entity retrieved earlier in a session.
History mechanisms and caches are different. In particular history mechanisms SHOULD NOT try to show a semantically transparent view of the current state of a resource. Rather, a history mechanism is meant to show exactly what the user saw at the time when the resource was retrieved.
By default, an expiration time does not apply to history mechanisms. If the entity is still in storage, a history mechanism SHOULD display it even if the entity has expired, unless the user has specifically configured the agent to refresh expired history documents.
It could hardly be clearer. Cache-directives do not apply to history.
Hey Jonny,
Sorry, my last paragraph is incorrect – it’s not ignoring them, it’s just not doing what I expect. I can completely see your point. One might question whether this is desirable behaviour, though it’s certainly in the RFC.
Apart from the “viewing is a security breach” scenario, which may be a bit far-fetched, what about the situation where you are storing some state in the page? Resubmitting it will result in a failure. So you want to tell Opera: “hey, once they’ve navigated off this page it’s old news – don’t show it to them again”. It seems to me that although a user might want to view pages they have visited, they are more likely to be hitting the back button to get back to “that page where I can“. In this scenario hasn’t Opera failed them as a user experience, however perfect its implementation of the RFC?
Happy to be wrong if you have any more comment to make! I haven’t actually used Opera for a while so this is all a bit hand-wavy from me!
As you might guess, I care about the history function. It is important for usability, and if you flutter back and forth in history, as Opera and more recently Firefox 2+ encourage, you will see why.
I also care about Opera, or for that matter other browsers and servers, complying with HTTP. HTTP is a protocol, and in my view it is more important that this works seamlessly than, say, a format like HTML (of course you want to do both).
From an efficiency viewpoint it is also a pity that the cache mechanism, the most complex and powerful part of HTTP, is largely disabled, partly for valid reasons like yours, but mostly for understandable but unfortunate reasons. If you have ads that pay per view (the most common case) and the ad is cached, you make less money. The solution is to specify that just the ad shouldn’t be cached, but this is often harder to do than preventing the entire page from being cached.
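The ad case could be sketched in PHP (the language used in the post) roughly as follows. This is only an illustration under assumptions: the function name cacheHeadersFor, the resource kinds 'ad' and 'page', and the five-minute lifetime are all hypothetical and not taken from any real site.

```php
<?php
// Hypothetical helper: choose cache headers per resource kind, so that only
// the pay-per-view ad is uncacheable while the surrounding page stays cacheable.
function cacheHeadersFor(string $kind): array
{
    if ($kind === 'ad') {
        return [
            'Cache-Control: no-cache, no-store',
            'Pragma: no-cache', // HTTP 1.0 backward compatibility
        ];
    }
    // Assumed policy for ordinary pages: cacheable for five minutes.
    return ['Cache-Control: max-age=300'];
}

// Usage in the script that serves the ad (commented out so that the sketch
// has no side effects when it is merely included):
// foreach (cacheHeadersFor('ad') as $line) { header($line); }
```

The function only returns header lines, so the policy can be checked without actually sending anything. The hard part mentioned above remains: the ad has to be served as its own resource for this to help.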
But your question is a valid one. HTTP is designed to be (mostly) stateless. This is by design (look up “REST” if you want the ideology), but that doesn’t mean that there aren’t applications that have state, and need to keep that state over multiple pages.
First of all if the cache control is a security concern with your application, you are already in trouble. Any HTTP request can be crafted by hand/tool, you don’t need a browser. On the server side you simply cannot assume that people will follow the path you have imagined they will.
A robust application will recover from any state to “just work”, effectively serving the appropriate response (better than “shame on you”), whether the state is stored in a cookie, the URL, ETags or other mechanisms. (For the banks, part of the answer is that the important things they do will be in an HTTPS session, again a case of state.) This can happen with any browser, e.g. if it has been offline for some time and then continues. People who use laptops or phones on the move will recognize that situation.
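One possible reading of “recover from any state”, sketched in PHP under stated assumptions: the form carries the version of the record it was rendered from, and the server compares it on submit. The function name handleSave and the array shapes are hypothetical, and this is one technique among several rather than a recommendation from the discussion above.

```php
<?php
// Hypothetical sketch: the form embeds the version of the record it was
// rendered from; the server decides what to do when that version is stale.
function handleSave(array $stored, array $submitted): array
{
    // Form values arrive as strings, so cast before comparing.
    $submittedVersion = (int) ($submitted['version'] ?? -1);

    if ($submittedVersion !== $stored['version']) {
        // Stale page (e.g. reached via the Back button) or a hand-crafted
        // request: do not save, and hand back the current state so the
        // client can show it instead of a bare error.
        return ['saved' => false, 'record' => $stored];
    }

    $stored['data'] = $submitted['data'] ?? $stored['data'];
    $stored['version']++;

    return ['saved' => true, 'record' => $stored];
}
```

With a check like this, it matters less whether the browser showed a page from its cache or from its history: a submission from an old copy is answered with the current state rather than being applied or rejected outright.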
(This happened to me right now in fact, my wireless connection dropped without me noticing. If I couldn’t have gone back in history when I got the error message, I would have had to retype this response.)
These are interesting points that you raise. I can see that it is very useful to have a complete history of your browsing. “Oh hang on what was the name of that thing?” – search history – find it. And I can see that browsers are only going to have more features like this.
Let me pick your brains. What would you do for a page with state? Say, for example, that you have an AJAX-y page where you can edit a page layout with drag and drop of various graphical elements. You don’t want to allow someone to return to the page at an arbitrary point in the past, you want them always to see what is currently saved on the server when they start editing. (I’m leaving aside server-side prevention of saving, it’s the user experience I’m talking about here.)
With Firefox, using the back button after saving takes you back to the *original* state of the page, not the state just before it was saved. But you can ask Firefox not to cache, and it is quite good about it.
Perhaps Opera takes you back to the page as it was just before you clicked Save?
“In this scenario hasn’t Opera failed them as a user experience, however perfect its implementation of the RFC?”
Totally agreed.
I have a similar question; sadly I haven’t found the answer yet. My question is “Why must the history list work that way?”
Why can’t the history list and cache be just one? I believe it is the history list requirement of the RFC, not Opera, that is flawed in terms of user experience. What is the point of viewing stale resources?
Jonny Axelsson says: “If I couldn’t have gone back in history when I got the error message, I would have had to retype this response”
The RFC doesn’t require that. The spec states that “Rather, a history mechanism is meant to show exactly what the user saw *at the time when the resource was retrieved*.” When this page is retrieved, the comment fields are empty, no? So Opera failed the RFC this time, which is actually a good thing.
IMHO, the HTTP spec is there for browsers and web applications so they can talk to each other, NOT to dictate how they should talk to the users. Either the spec has to be updated, or the browsers/applications will override or ignore part of the spec so they can best serve their users.