https - How to add resources to firefox cache database manually?

2014-07
  • kev

    My ISP blocks websites from time to time. Recently I cannot access HTTPS-powered sites.
    Firefox cannot load https://ajax.googleapis.com/.../jquery.min.js ...
    But I can still access http://ajax.googleapis.com/.../jquery.min.js ...
    I notice that if I request the https resource, Firefox will not serve it from the files cached for the http URL.
    Is there some kind of tool or method to manipulate Firefox's cache?
    I want to add https://.../jquery.min.js to the Firefox cache database manually.

  • Answers
  • sklr

    You can completely disable the cache with the Web Developer extension - https://addons.mozilla.org/bg/firefox/addon/web-developer/ - or you can download a whole website for offline use with ScrapBook - https://addons.mozilla.org/en-US/firefox/addon/scrapbook/


  • Related Question

    firefox - Caching in modern browsers sucks, why and how to fix?
  • jdm

    One thing I've noticed where all modern browsers fall short is caching. I remember that years ago in Internet Explorer 5 - a browser that seems horrid by today's standards! - I could select "File/Offline Mode" at any time and then browse everything I had visited in the last couple of days from the cache. It would even automatically activate offline mode when the connection went down. It also seemed to use the cache much more aggressively than browsers do nowadays, even when browsing online. All of this was a necessity with the modems of the day and their slow and unreliable connections. Nowadays, when I'm travelling with my netbook, I could frequently use such a feature, especially when WiFi is flaky or not available.

    Firefox still has an option to "work offline", and it works on a handful of pages, but it seems very limited. Also, there is no straightforward way to see which sites in my history are cached.

    Is there a way to make caching more "aggressive" or comprehensive, and to make offline mode useful again? Maybe extensions, or a certain browser?


  • Related Answers
  • William C

    Later versions of Squid (2.2 and later) have an "offline_mode" feature.

    This mode turns off cache validation: if a resource is already in the squid cache, squid will not contact the origin website to check whether the cached copy is still valid/fresh.

    Combine offline_mode on and an aggressive catch-all refresh_pattern such as

    refresh_pattern . 10080 9999% 43200 override-expire ignore-reload ignore-no-cache ignore-no-store ignore-must-revalidate ignore-private override-lastmod reload-into-ims store-stale

    and you can go offline for months and still be able to revisit static websites you visited before!
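
    Putting the two together, a minimal squid.conf sketch might look like the following (the port, cache path and cache size are only illustrative placeholders, not taken from the answer, and the refresh_pattern options assume a reasonably recent squid build):

    # minimal squid.conf sketch - port, path and sizes are placeholders
    http_port 3128
    cache_dir ufs /var/spool/squid 10000 16 256
    # never revalidate objects that are already in the cache
    offline_mode on
    # aggressive catch-all refresh pattern from above
    refresh_pattern . 10080 9999% 43200 override-expire ignore-reload ignore-no-cache ignore-no-store ignore-must-revalidate ignore-private override-lastmod reload-into-ims store-stale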

    For more info, read http://www.squid-cache.org/Doc/config/offline_mode/ and http://linuxdevcenter.com/pub/a/linux/2001/08/02/offline_squid.html . Squid runs on most operating systems, so give it a spin.

    I hope this answers your last question.

    Now, to answer the "why" in your question title: the web is no longer what it was back in the days of IE5. The majority of websites will break in offline mode. The web has become more reliant on dynamic, live content, i.e., much content now is not designed to be cached for long. Read this question I asked on the Squid Users mailing list.

  • ultrasawblade

    An HTML author can use HTTP headers and meta tags to instruct a browser not to cache a page.

    This is the trend now, given that HTML and browsers these days are considered to form more or less a general application-level protocol/runtime environment, and not just a static document retrieval protocol.
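
    For illustration, these are the kinds of headers and tags being referred to (typical textbook values, not taken from any particular site):

    Cache-Control: no-store, no-cache, must-revalidate
    Pragma: no-cache
    Expires: 0

    or, as (legacy) meta equivalents inside the HTML itself:

    <meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate">
    <meta http-equiv="Pragma" content="no-cache">
    <meta http-equiv="Expires" content="0">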

    It's technically possible to strip or change any unwanted stuff in HTML documents using a proxy server. Squid would provide the framework for this capability - including running HTML traffic through a script that can modify content on the fly - but you'd have to write your own script to remove or alter the tags that create the behavior you don't want. Also, messing with the JavaScript in pages is messy, time-consuming, different for each site, and the payoff is usually not worth the effort.
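
    As a rough sketch of the squid side only (assuming a squid 3.1+ build with ICAP support; the service name and the rewriting service behind the URL are hypothetical - the service is exactly the part you would have to write yourself):

    # squid.conf fragment - pass HTML responses to a local ICAP service for rewriting
    icap_enable on
    # "html_rewrite" and the service URL are placeholders for a service you provide
    icap_service html_rewrite respmod_precache icap://127.0.0.1:1344/rewrite
    adaptation_access html_rewrite allow all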

    I don't know of a turnkey solution that provides this.

    Generally, I've found it helpful to capture pages by printing them to PDF or similar, instead of relying on the browser cache to remember what I was doing.