memory - Slower CPU (P-state) = slower RAM performance?

2014-07-08
  • agz

    I noticed that if I lower my CPU clock speed via P-states/SpeedStep, my RAM slows down. However, through CPU-Z, my HT link, which is what connects my CPU to the memory, is still running at the same clock speed. What causes the RAM speed to slow down?

    Here's what I did:

    • I used the AMD Catalyst utility to underclock the CPU to 800 MHz.
    • I ran Geekbench.
    • The Geekbench memory score dropped significantly.
  • Answers
  • Marcus Chan

    From the Geekbench 2 benchmark description page:

    Memory benchmarks measure not only the performance of the underlying memory hardware, but also the performance of the functions provided by the operating system used to manipulate memory.

    • Read Sequential loads values from memory into registers.
    • Write Sequential stores values from registers into memory.
    • Stdlib Allocate allocates and deallocates blocks of memory of varying sizes using functions from the C Standard Library.
    • Stdlib Write writes a constant value to a block of memory using functions from the C Standard Library.
    • Stdlib Copy copies values from one block of memory to another using functions from the C Standard Library.

    I'm guessing that because Geekbench is (in part) testing how quickly the CPU can move data between its registers and RAM, those tests would naturally be slower at a reduced CPU clock. And, as harrymc mentioned, a reduced CPU speed makes the whole test run slower.

    In general, score-based benchmarks like Geekbench tell you very little about how your system is running. There's no real way to isolate "memory performance" as a separate quantity, since there are so many variables and most of them are directly tied to the performance of the rest of your system.


  • Related Question

    memory - How does the CPU write information to RAM?
  • Questioner

    My question is: how does the CPU write data to RAM?

    From what I understand, modern CPUs use several levels of cache to speed up RAM access. The RAM receives a request for data and sends a burst back to the CPU, which stores the required data (plus a bunch of extra data that was close to the requested address) in the highest-level cache. The CPU then progressively asks the caches to pass smaller and smaller chunks down the levels until the data reaches the L1 cache, from which it is read directly into a CPU register.

    How does this process work when the CPU writes to memory? Does the data go back down the levels of cache, in reverse order compared to a read? If so, what about synchronizing the information in the different caches with main memory? Also, how does the speed of a write operation compare to a read? What happens if I'm continuously writing to RAM, such as in the case of a bucket sort?

    Thanks in advance,

    -Faken

    Edit: I still haven't really gotten an answer I can fully accept. I especially want to know about the synchronization part of a RAM write. I know that the CPU writes directly to the L1 cache, and that the data gets pushed down through the cache levels as they synchronize, until main RAM is eventually synchronized with the highest-tier cache. What I would like to know is WHEN the caches synchronize with each other and with main RAM, and how fast writes are in relation to reads.


  • Related Answers
  • Skizz

    Ah, this is one of those simple questions that have really complex answers. The simple answer is, well, it depends on how the write was done and what sort of caching there is. Here's a useful primer on how caches work.

    CPUs can write data in various ways. Without any caching, the data is stored in memory straight away and the CPU waits for the write to complete. With caching, the CPU usually stores data in program order, i.e. if the program writes to address A and then address B, then memory A will be written before memory B, regardless of the caching. The caching only affects when the physical memory is updated, and this depends on the type of caching used (see the link above). Some CPUs can also store data non-temporally, that is, the writes can be re-ordered to make the most of memory bandwidth. So writing to A, then B, then A+1 could be reordered into writing A and A+1 in a single burst, then B.

    Another complication arises when more than one CPU is present. Depending on how the system is designed, writes by one CPU won't be seen by the other CPUs because the data is still in the first CPU's cache (the cache line is dirty). In multi-CPU systems, making each CPU's cache match what is in physical memory is termed cache coherency. There are various ways this can be achieved.

    Of course, the above is geared towards Pentium processors. Other processors can do things in other ways. Take, for example, the PS3's Cell processor. The basic architecture of a Cell CPU is one PowerPC core with several Cell cores (on the PS3 there are eight cells, one of which is always disabled to improve yields). Each cell has its own local memory, sort of an L1 cache, which is never written to system RAM. Data can be transferred between this local RAM and system RAM using DMA (Direct Memory Access) transfers. A cell can access system RAM and the RAM of other cells using what appear to be normal reads and writes, but these just trigger a DMA transfer (so they're slow and really should be avoided). The idea behind this system is that the game is not just one program, but many smaller programs that combine to do the same thing (if you know *nix, it's like piping command-line programs together to achieve more complex tasks).

    To sum up, writing to RAM used to be really simple in the days when CPU speed matched RAM speed, but as CPU speed increased and caches were introduced, the process became more complex with many different methods.

    Skizz

  • Am1rr3zA

    Yes, it goes back down through the levels of cache and is eventually saved to main memory. The important point is that in a multiprocessing system, the caches are shared between two or more processors (cores), so the data must be kept consistent. This is done either by using a single shared cache for all the processors, or by keeping separate caches consistent with a coherence mechanism: if the data in one cache changes, it is forced to be written back to memory and the other caches are updated (or invalidated).