Source: Tom's Hardware UK – Keywords: AMD, K10, architecture
Categories: Hardware
A corrected and re-examined memory hierarchy (continued)
With any architecture comprising four cores, the memory hierarchy becomes particularly important. The cores must be able to exchange data efficiently, while at the same time avoiding polluting the caches of the other cores, which are working on separate tasks.
The three-level architecture is the solution to this problem: each of the four cores has 512 KB of private space (its L2 cache), and all four share a unified space (the L3 cache) through which they can communicate efficiently.
The way this new hierarchy works doesn’t change anything we know about AMD processors since the Thunderbird. The L2 cache is still exclusive, which means that data held in the L1 cache is not duplicated in it; similarly, the L3 cache is exclusive with respect to the L2.
These are often called victim caches: when data is about to be evicted from the L1 cache because of a conflict, instead of being written back to main memory it is placed in the L2 cache; similarly, when data must be evicted from the L2, it is written to the L3 cache.
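The victim mechanism described above can be sketched in a few lines of Python. This is a toy model for a single core, not AMD's implementation: the class name `VictimLevel`, the `access` helper, the LRU replacement policy and the tiny capacities (1, 2 and 4 lines) are all assumptions chosen so that the evictions are easy to follow.

```python
from collections import OrderedDict


class VictimLevel:
    """One cache level with LRU replacement (assumed policy).

    Lines evicted from this level spill into `next_level`.
    """

    def __init__(self, name, capacity, next_level=None):
        self.name = name
        self.capacity = capacity        # number of lines (hypothetical, tiny)
        self.next_level = next_level    # None means victims go to main memory
        self.lines = OrderedDict()      # address -> True, oldest first

    def insert(self, addr):
        self.lines[addr] = True
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            victim, _ = self.lines.popitem(last=False)  # least recently used
            if self.next_level is not None:
                self.next_level.insert(victim)          # victim moves down a level

    def remove(self, addr):
        """Take a line out of this level; True if it was present."""
        return self.lines.pop(addr, None) is not None


def access(addr, l1, l2, l3):
    """Bring `addr` into L1 and report where it was found."""
    if addr in l1.lines:
        l1.lines.move_to_end(addr)
        return "L1"
    # Exclusive hierarchy: a line found lower down is moved up, not copied.
    if l2.remove(addr):
        source = "L2"
    elif l3.remove(addr):
        source = "L3"
    else:
        source = "memory"
    l1.insert(addr)
    return source


l3 = VictimLevel("L3", capacity=4)
l2 = VictimLevel("L2", capacity=2, next_level=l3)
l1 = VictimLevel("L1", capacity=1, next_level=l2)

found = [access(a, l1, l2, l3) for a in "ABCDA"]
print(found)  # ['memory', 'memory', 'memory', 'memory', 'L3']
```

Line A is pushed out of L1, then out of L2, and ends up in L3, which is where the final access finds it. At no point does an address sit in two levels at once, which is what "exclusive" means here.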
The L3 cache also includes algorithms designed to make it more efficient for multithreaded applications. When data is requested from the L3 cache and is likely to be accessed by the other cores (if it is code, or if the data was shared previously), it is copied into the requesting core's L1 cache and also remains in the L3.
If the data is needed by only one core, it is placed directly in that core's L1 cache and removed from the L3. You may also have heard the term non-inclusive victim cache: the L3 cache contains “victims” evicted from the L2 cache, just as the L2 cache only contains data that was once in the L1 cache.
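The L3 read policy just described (copy the line if it is likely to be shared, move it otherwise) can be illustrated with a short Python sketch. Everything here is hypothetical: the function `fetch_from_l3`, the use of plain sets to model the caches, and the `code_lines` and `shared_lines` sets, which merely stand in for whatever information the hardware actually uses to guess that other cores will want the line.

```python
def fetch_from_l3(addr, core, l1_caches, l3, code_lines, shared_lines):
    """Serve a request from `core` that hits in the shared L3.

    l1_caches    : dict mapping core id -> set of addresses in that core's L1
    l3           : set of addresses currently held in the L3
    code_lines   : addresses known to hold code (assumed sharing hint)
    shared_lines : addresses that were shared before (assumed sharing hint)

    Returns True if the line stays in the L3, False if it was moved out.
    """
    if addr not in l3:
        raise KeyError(f"{addr!r} is not in the L3")
    l1_caches[core].add(addr)              # the requester always gets the line
    stays_in_l3 = addr in code_lines or addr in shared_lines
    if not stays_in_l3:
        l3.discard(addr)                   # private data: moved, not copied
    return stays_in_l3


l1_caches = {0: set(), 1: set()}
l3 = {"shared_code", "private_buf"}
code_lines = {"shared_code"}
shared_lines = set()

kept = fetch_from_l3("shared_code", 0, l1_caches, l3, code_lines, shared_lines)
moved = fetch_from_l3("private_buf", 1, l1_caches, l3, code_lines, shared_lines)
print(kept, moved, sorted(l3))  # True False ['shared_code']
```

The code line stays available in the L3 for the other cores, while the private buffer leaves the L3 entirely and frees the space for another victim.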
As regards the characteristics of these different cache levels, little changes for Barcelona: the L1 cache is still 2-way set associative and has a latency of 3 cycles. Its interface, on the other hand, has been widened to 256 bits versus 128 on the K8. The L2 cache is 16-way set associative and has a latency of 9 cycles on top of the 3 cycles of the L1 cache. Finally, the L3 cache is 32-way set associative. AMD doesn’t give its latency, which makes sense given that it varies with the bandwidth demanded by the different cores.
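To make the associativity and latency figures concrete, here is a small Python sketch of the arithmetic involved. The 512 KB size, the 16 ways and the 3 + 9 cycle latencies come from the text; the 64-byte cache line is an assumption not stated in the article, and the helper names `num_sets` and `set_index` are purely illustrative.

```python
LINE_BYTES = 64          # assumed line size, not given in the article


def num_sets(size_bytes, ways, line_bytes=LINE_BYTES):
    """Number of sets in a set-associative cache."""
    return size_bytes // (ways * line_bytes)


def set_index(addr, size_bytes, ways, line_bytes=LINE_BYTES):
    """Set that a byte address maps to (simple modulo indexing)."""
    return (addr // line_bytes) % num_sets(size_bytes, ways, line_bytes)


L2_SIZE = 512 * 1024     # per-core L2, from the article
L2_WAYS = 16             # from the article

L1_LATENCY = 3           # cycles, from the article
L2_EXTRA_LATENCY = 9     # cycles on top of the L1, from the article

l2_sets = num_sets(L2_SIZE, L2_WAYS)
l2_hit_latency = L1_LATENCY + L2_EXTRA_LATENCY

print(l2_sets)                                # 512
print(l2_hit_latency)                         # 12
print(set_index(0x12345, L2_SIZE, L2_WAYS))   # 141
```

Under these assumptions, two addresses that are a multiple of 512 × 64 = 32,768 bytes apart compete for the same set, and the L2 can hold up to 16 such lines before one of them has to be evicted to the L3.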

Why is the wording under the pictures in French?
It looks like the editor either hasn't been doing his/her job properly or is not a fluent English speaker. There are at least half a dozen spelling errors in this article, and the grammar is somewhat less than perfect!
Apart from that, an interesting read.
Re: "AMD K10: The Architecture of the Revival?"
Article compares apples and oranges. :-(
i.e. It would be fair to compare the memory architecture of Coppermine vs. Thunderbird, as an example of where AMD /romped/ ahead.
Go back to Tom's Hardware's own archives and compare those memory architectures.
Or as another example, compare Katmai with the original Slot-A Athlon K75.
Where's the definitive great chart of all (x86) CPUs gone? Where are the archives?? What happened to the once-great tomshardware.com????
Cheers!
Fragz.