networking - Does the internet cache all data?

2014-02-09
  • Friend of Kim

    To make my question clear, I'm going to use fairly large numbers. A file server is connected to the internet through a 1 Gbps line. The server is sending a 100 GB file to the client. The file is split into packets and sent at 1 Gbps to the client's ISP. The client, however, is connected to the ISP with a 1 Mbps line. This would mean that the ISP would have to save/cache all the data being sent to it from the file server until it is all received by the client.

    Is this how it is done, or does the server somehow send the packets at the rate of the slowest link between the server and the client?
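
    The numbers in the question can be worked through directly. The following is a rough sketch that assumes decimal units (1 GB = 10**9 bytes) and ignores all protocol overhead; the variable names are only illustrative. It shows how much data the ISP would be left holding if the server really did send at full speed:

```python
# Back-of-the-envelope numbers for the scenario in the question.
# Assumes decimal units (1 GB = 10**9 bytes) and ignores protocol overhead.

FILE_BITS = 100 * 10**9 * 8      # 100 GB file, in bits
FAST_BPS = 10**9                 # 1 Gbps server-to-ISP link
SLOW_BPS = 10**6                 # 1 Mbps ISP-to-client link

fast_seconds = FILE_BITS / FAST_BPS   # time to push the file into the ISP
slow_seconds = FILE_BITS / SLOW_BPS   # time for the client to receive it

# If the server sent at full speed, the ISP would hold whatever had arrived
# but not yet left by the time the fast transfer finished.
drained_bits = SLOW_BPS * fast_seconds
backlog_gb = (FILE_BITS - drained_bits) / 8 / 10**9

print(f"fast link: {fast_seconds:.0f} s")             # fast link: 800 s
print(f"slow link: {slow_seconds / 86400:.2f} days")  # slow link: 9.26 days
print(f"backlog at the ISP: {backlog_gb:.1f} GB")     # backlog at the ISP: 99.9 GB
```

    In other words, under these assumptions almost the entire file would have to sit at the ISP for over nine days, which is why the question matters.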

  • Answers
  • The Spooniest

    The Internet does not use only one protocol. It doesn't even use only one protocol at a time: it actually uses several at once, which stack up on top of each other to do all sorts of different things. If we oversimplify things a little, you could say that it uses four layers of them.

    • Link Layer: A protocol that lets you push a signal down a wire (or radio waves, or flashes of light, or whatever) to another machine on the other end. Examples include PPP, WiFi, and Ethernet.
    • Network Layer: A protocol that lets you push a signal through a chain of machines, so that you can get data between machines that aren't directly connected. This is where IP and IPv6 live.
    • Transport Layer: A protocol that lets you make some basic sense out of that signal. Some, like TCP, establish "virtual connections" between two machines, as though there were a wire straight between them. Others, like UDP, just blast bits of data out from one machine to another. Different protocols have different strengths and weaknesses, which is part of why there are so many.
    • Application Layer: These are what we typically think of as "protocols". They're specialized for certain types of data, meant for specific purposes. Some examples include FTP, HTTP, and BitTorrent, which all transfer files.
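
    The stacking described in this list can be pictured as each layer wrapping whatever the layer above hands it. The following is a toy sketch only: the bracketed "headers" and the `wrap`, `send`, and `receive` helpers are made up for illustration and bear no resemblance to real wire formats.

```python
# Toy illustration of protocol layering: each layer wraps the data handed
# down from the layer above. The bracketed header strings are invented for
# readability and are not real wire formats.

def wrap(layer_header: str, payload: str) -> str:
    """Prefix the payload with a layer's (made-up) header."""
    return f"[{layer_header}]{payload}"

def send(app_data: str) -> str:
    """Pass application data down the stack, one wrapper per layer."""
    message = wrap("HTTP", app_data)    # application layer
    segment = wrap("TCP", message)      # transport layer
    packet = wrap("IP", segment)        # network layer
    frame = wrap("Ethernet", packet)    # link layer
    return frame

def receive(frame: str) -> str:
    """Strip the headers in the opposite order, outermost first."""
    data = frame
    for header in ("Ethernet", "IP", "TCP", "HTTP"):
        prefix = f"[{header}]"
        if not data.startswith(prefix):
            raise ValueError(f"expected {header} header")
        data = data[len(prefix):]
    return data

frame = send("GET /file")
print(frame)            # [Ethernet][IP][TCP][HTTP]GET /file
print(receive(frame))   # GET /file
```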

    Those file transfer protocols I mentioned are typically stacked on top of TCP (which is itself stacked on top of IP), which is where we get to your particular question. TCP tries, as best it can, to work like a wire straight between the machines would: when the server sends a packet, it can be sure the client got it, and it can be sure the client got its packets in the same order that the server sent them. Part of the way it does this is that the data the server sends has to be acknowledged by the client: the client sends a small signal back saying "OK, I got that packet you sent; I'm ready for the next one." If the server doesn't get that acknowledgment, it keeps resending the packet until it does (or decides that this is never going to work and gives up).

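    This is the key to answering your question. In the simplest picture, the server can't send Packet 2 until it knows that Packet 1 got through, which can't happen until Packet 1 gets acknowledged, and that can't happen until Packet 1 has actually arrived. Real TCP relaxes this a little by allowing a limited window of unacknowledged packets in flight at once, but the effect is the same: the server can never get more than one window ahead of what the client has acknowledged, so it ends up sending at the pace of the slowest link. The machines in the middle of the chain don't have to cache the file; they only hold a short queue of packets in transit, and if that queue overflows they simply drop packets, which the server then resends.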
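
    The acknowledgment argument can be checked with a toy simulation. This is a sketch under heavy assumptions, not a model of real TCP: there is no round-trip delay, no packet loss, and no congestion control, and the `simulate` function and its parameters are hypothetical. A `window` of 1 matches the one-packet-at-a-time description; real TCP allows a larger window of unacknowledged packets, but the queue in the middle stays bounded by it either way.

```python
from collections import deque

def simulate(total_packets: int, window: int,
             fast_rate: int = 1000, slow_rate: int = 1):
    """Return (ticks, peak_queue) for a windowed sender behind a slow link.

    Each tick the server may hand up to fast_rate packets to the router, but
    never more than `window` unacknowledged packets may be outstanding. The
    router forwards up to slow_rate packets per tick to the client, which
    acknowledges them immediately.
    """
    sent = acked = 0
    queue = deque()          # packets waiting at the router
    peak_queue = 0
    ticks = 0
    while acked < total_packets:
        ticks += 1
        # Server: limited by its link speed, the window, and what is left.
        can_send = min(fast_rate, window - (sent - acked), total_packets - sent)
        for _ in range(can_send):
            queue.append(sent)
            sent += 1
        peak_queue = max(peak_queue, len(queue))
        # Router: drains at the speed of the slow link; each delivery is acked.
        for _ in range(min(slow_rate, len(queue))):
            queue.popleft()
            acked += 1
    return ticks, peak_queue

print(simulate(10_000, window=1))       # (10000, 1)
print(simulate(10_000, window=64))      # (10000, 64)
print(simulate(10_000, window=10_000))  # (10000, 9991), effectively no limit
```

    In all three runs the transfer takes the same time, because the slow link sets the pace. Only the amount of data parked at the router changes, and it never exceeds the window.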

    One last point: technically, this only means that the Internet doesn't have to cache data the way you're talking about. If someone really wanted to cache all of this data, they could; there's nothing in the protocols to really stop it from happening. But the Internet doesn't need these caches in order to work.

  • JSanchez

    Many servers (Web, FTP, what have you) use bandwidth throttling to keep one client from hogging all the available bandwidth. Each connection may be limited to a certain speed, so that multiple clients can connect without slowing each other down too much. Remember, too, that your connection is limited by the slowest link in the chain.
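
    One common way to implement this kind of per-connection limit is a token bucket. The following is a minimal sketch of that technique, not a description of what any particular Web or FTP server does; the `TokenBucket` class and its parameter names are hypothetical.

```python
import time

class TokenBucket:
    """Allow at most `rate` bytes per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate            # tokens (bytes) added per second
        self.capacity = capacity    # maximum tokens the bucket can hold
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = capacity
        self.last = clock()

    def _refill(self) -> None:
        """Add tokens for the time elapsed since the last call."""
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_consume(self, nbytes: int) -> bool:
        """Spend tokens for nbytes if available; otherwise refuse."""
        self._refill()
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

    A server using something like this would call `try_consume(len(chunk))` before writing each chunk to a connection, and wait briefly whenever it returns `False`.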


  • Related Question

    internet - Why does my webpage look different when I connect using different routers?! Do routers cache files?
  • Ayyash

    Here is the situation: I am working on a site from both the office and home. I recently updated the stylesheets and opened the live site from the office (using the same laptop I use all the time), and everything looked okay. When I come home and use my home internet connection to open the site on the SAME laptop, the styles are not updated!

    The thing is: this happens in ALL browsers, even after emptying the cache many times, even after one month of work, and even if I have never opened the site before in that browser (as if my router has a cache of its own).

    Another thing: only one particular file, styles.css, seems to be stuck.

    Extra info: my home wireless router uses the same IP address as the one in the office, the usual 192.168.0.1.


  • Related Answers
  • DRG
    1. Who is hosting the website? Your work or a 3rd party (Squarespace, GoDaddy)?
    2. If it's hosted by your work, are you uploading your content to a production server? Some businesses have a test server where sites can be tried out before they go live on the internet.
    3. If it's a 3rd party, it could be a caching issue with your ISP. ISPs usually refresh their caches very frequently, but sometimes things get stuck.
    4. Can you edit / upload changes to the page? Try doing so and see if things change at home.
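
    One way to make the comparison in step 4 concrete is to hash the stylesheet as it arrives at each location. The following is a rough sketch: `fingerprint` and `fetch_fingerprint` are hypothetical helper names, the URL in the comment is a placeholder, and the `Cache-Control: no-cache` request header only asks intermediate caches to revalidate, which a misbehaving cache may ignore.

```python
import hashlib
import urllib.request

def fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a response body."""
    return hashlib.sha256(body).hexdigest()

def fetch_fingerprint(url: str) -> str:
    """Download url, asking caches along the way to revalidate, and hash it."""
    request = urllib.request.Request(url, headers={"Cache-Control": "no-cache"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return fingerprint(response.read())

# Example (placeholder URL; substitute the real location of styles.css):
# print(fetch_fingerprint("https://example.com/styles.css"))
```

    If the digest printed at home differs from the one printed at the office, the two locations are being served different copies of the file, which points at something between the browser and the server rather than at the browser cache.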
  • Dave Sherohman

    Assuming you've already checked to verify that your request is going to the same server regardless of where you are[1], the other major possibility is that the server may be configured to display different versions of the page for "internal" vs. "external" visitors, based on the IP address that the request comes from[2].

    The most likely solution for either of these would be to set up a VPN to connect you into the office network so that the web server will see your requests as coming from an "internal" IP address.

    [1] Checking that the hostname resolves to the same IP address from both locations is a decent, but not foolproof, way of doing this.

    [2] This would be the router's internet-facing address, not the 192.168.x.y address.
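
    The check in footnote [1] can be scripted. The following is a small sketch using Python's standard `socket.getaddrinfo`; the `resolve_ips` helper and the hostname in the comment are assumptions for illustration. As the footnote says, matching addresses are a decent but not foolproof sign that both locations reach the same server.

```python
import socket

def resolve_ips(hostname: str, resolver=socket.getaddrinfo) -> set:
    """Return the set of IP addresses the hostname resolves to."""
    results = resolver(hostname, 80, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return {sockaddr[0] for *_, sockaddr in results}

# Example (placeholder hostname): run this at the office and at home,
# then compare the two printed sets.
# print(sorted(resolve_ips("www.example.com")))
```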