memory - Which files are cached by Linux?

2014-04-06
  • Max Beikirch

    I understand that Linux uses otherwise unused RAM to keep some files cached. But I wonder which files it actually caches. If you take a look at the output of free:

                 total       used       free     shared    buffers     cached
    Mem:          5.8G       3.7G       2.1G         0B       259M       1.7G
    -/+ buffers/cache:       1.8G       4.0G
    Swap:         4.0G         0B       4.0G
    

    it says that 1.7 GB of my RAM is used as cache. I know that this is harmless, but I am curious which files Linux considers cache-worthy. Is there a command or a file that shows me the cached files?

  • Answers
  • yjwong

    As far as I know, the Linux kernel caches individual pages of a file rather than whole files, so a file may be only partly in the page cache. One tool you can use to find out whether some of a file's contents are in the page cache is fincore from the linux-ftools project. It only reports on the files you pass to it, so it does not list every cached file on disk, but it gives you a rough idea of what is loaded into the page cache.

    An example (quoted from the project's website):

    root@xxxxxx:/var/lib/mysql/blogindex# fincore --pages=false --summarize --only-cached * 
    stats for CLUSTER_LOG_2010_05_21.MYI: file size=93840384 , total pages=22910 , cached pages=1 , cached size=4096, cached perc=0.004365 
    stats for CLUSTER_LOG_2010_05_22.MYI: file size=417792 , total pages=102 , cached pages=1 , cached size=4096, cached perc=0.980392 
    stats for CLUSTER_LOG_2010_05_23.MYI: file size=826368 , total pages=201 , cached pages=1 , cached size=4096, cached perc=0.497512 
    stats for CLUSTER_LOG_2010_05_24.MYI: file size=192512 , total pages=47 , cached pages=1 , cached size=4096, cached perc=2.127660 
    stats for CLUSTER_LOG_2010_06_03.MYI: file size=345088 , total pages=84 , cached pages=43 , cached size=176128, cached perc=51.190476 
    stats for CLUSTER_LOG_2010_06_04.MYD: file size=1478552 , total pages=360 , cached pages=97 , cached size=397312, cached perc=26.944444 
    stats for CLUSTER_LOG_2010_06_04.MYI: file size=205824 , total pages=50 , cached pages=29 , cached size=118784, cached perc=58.000000 
    stats for COMMENT_CONTENT_2010_06_03.MYI: file size=100051968 , total pages=24426 , cached pages=10253 , cached size=41996288, cached perc=41.975764 
    stats for COMMENT_CONTENT_2010_06_04.MYD: file size=716369644 , total pages=174894 , cached pages=79821 , cached size=326946816, cached perc=45.639645 
    stats for COMMENT_CONTENT_2010_06_04.MYI: file size=56832000 , total pages=13875 , cached pages=5365 , cached size=21975040, cached perc=38.666667 
    stats for FEED_CONTENT_2010_06_03.MYI: file size=1001518080 , total pages=244511 , cached pages=98975 , cached size=405401600, cached perc=40.478751 
    stats for FEED_CONTENT_2010_06_04.MYD: file size=9206385684 , total pages=2247652 , cached pages=1018661 , cached size=4172435456, cached perc=45.321117 
    stats for FEED_CONTENT_2010_06_04.MYI: file size=638005248 , total pages=155763 , cached pages=52912 , cached size=216727552, cached perc=33.969556 
    stats for FEED_CONTENT_2010_06_04.frm: file size=9840 , total pages=2 , cached pages=3 , cached size=12288, cached perc=150.000000 
    stats for PERMALINK_CONTENT_2010_06_03.MYI: file size=1035290624 , total pages=252756 , cached pages=108563 , cached size=444674048, cached perc=42.951700 
    stats for PERMALINK_CONTENT_2010_06_04.MYD: file size=55619712720 , total pages=13579031 , cached pages=6590322 , cached size=26993958912, cached perc=48.533080 
    stats for PERMALINK_CONTENT_2010_06_04.MYI: file size=659397632 , total pages=160985 , cached pages=54304 , cached size=222429184, cached perc=33.732335 
    stats for PERMALINK_CONTENT_2010_06_04.frm: file size=10156 , total pages=2 , cached pages=3 , cached size=12288, cached perc=150.000000 
    ---
    total cached size: 32847278080
    

    The command above lists some of the *.MYD, *.MYI and *.frm files that have some pages stored in the page cache.
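
    To see which files account for most of the cache, the summary lines can be sorted by their cached size. The following is a minimal sketch, assuming the linux-ftools output format quoted above; the helper name top_cached is made up for illustration and is not part of fincore.

```shell
#!/bin/sh
# top_cached (hypothetical helper): read linux-ftools fincore summary lines
# on stdin and print "cached_bytes filename", largest first.
# Lines that do not look like "stats for NAME: ... cached size=N, ..."
# (such as the "---" separator and the total) are ignored.
top_cached() {
    sed -n 's/^ *stats for \(.*\): file size=.*cached size=\([0-9][0-9]*\),.*/\2 \1/p' |
        sort -rn
}

# Possible usage, assuming fincore from linux-ftools is installed:
#   fincore --pages=false --summarize --only-cached * | top_cached | head
```

    Sorting on the byte count rather than the percentage keeps tiny files that are fully cached from crowding out the large ones.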

    If you really want to find every file with at least one page in the page cache, something like this may work (untested; it will produce a lot of output and will probably take a very long time to run):

    cd /
    find . -type f -print0 | xargs -0 fincore --pages=false --summarize --only-cached
    

  • Related Question

    memory - How can I keep a file in Windows 7's cache?
  • netvope

    Sometimes you know better than Windows which files will be re-used later. Suppose you have 8GB of memory, and you use the same 1GB file every hour in an I/O-bound application (which takes 1 second to finish if the file is cached, and 1 minute if not). Now you process some other 16GB of data that is not going to be re-used. Naturally, the frequently used 1GB file will be pushed out of the cache. It would be beneficial if one could tell Windows to keep that 1GB file in memory. (Better yet, it would be great if one could tell Windows not to cache those 16GB of data, but I'm not optimistic that this can be done.)

    The situation is worse for files on network shares: Windows removes files from the cache even if there is free memory. If you immediately re-use a file, it will still be in the cache; but if you close the file and wait for 30 seconds, the cached copy is gone, and the system will need to re-fetch the file from the remote server. For me this is very noticeable because I'm on a 3 Mbps network link and I work with files that are about 10 MB in size.

    The poor man's way to keep a file in the cache would be to keep reading the file. Are there any better ways? Are you aware of any programs that do this?

    (If this can be easily done under Linux, please let me know too.)


  • Related Answers
  • Nifle

    Two things come to mind.

    1. Copy the file to a RAM disk before first use and copy it back to disk when you are done with it. (QSoft’s RamDisk for $12 was recommended here)
    2. Buy a fast SSD and see if that helps (enough)
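
    For reference, the poor man's approach mentioned in the question (keep re-reading the file) can be sketched in POSIX shell, for example on Linux. This is only an illustration: the helper names warm_file and keep_warm are made up, the 20-second default is an assumed value picked to stay under the 30-second expiry described in the question, and re-reading does not guarantee that the pages stay cached under memory pressure.

```shell
#!/bin/sh
# warm_file (hypothetical helper): read a file once so that its pages are
# pulled into the page cache; the data itself is discarded.
warm_file() {
    cat -- "$1" > /dev/null
}

# keep_warm (hypothetical helper): re-read FILE every INTERVAL seconds.
# Runs until interrupted or until the file can no longer be read.
keep_warm() {
    file=$1
    interval=${2:-20}   # assumed default, see the note above
    while warm_file "$file"; do
        sleep "$interval"
    done
}

# Possible usage (the path is a placeholder), running in the background:
#   keep_warm /path/to/data.bin 20 &
```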