Comparison tests of 1GHz processors
(continued)

Monday, September 18, 2000


The AMD Athlon "Thunderbird" 1GHz

At the inception of the AMD Athlon, the principal characteristic that distinguished it from Intel's offerings was that Intel's CPUs integrated their L2 caches into the processor die, while AMD's did not. So, while the Pentium III "Coppermine"'s L2 cache was free to operate at the same clock rate as the processor core, the Athlon's L2 cache was hampered by being located outside the body of the CPU. In the best of cases, the "classic" Athlon's L2 cache operated at only 350MHz, half the core clock, and that situation was unique to the 700MHz Athlon. The reason is quite simply that an external cache is made up of SRAM, which is quite costly. To save money, manufacturers are often forced to use memory that runs quite a bit slower than the CPU it is meant to work with. So, from a CPU manufacturer's point of view, the most economical solution is to integrate the cache into the body of the CPU itself, which saves money and at the same time allows the cache to run at the same speed as the CPU core. Still, there is only so much real estate available on a single chip, and for that reason the 512KB L2 cache of the "classic" Athlon was reduced to 256KB in order to be integrated into the new generation - the AMD Athlon "Thunderbird" and the Duron.

In short, AMD's "Thunderbird" and Duron processors are the first to use an internal L2 cache, though the implementation itself is slightly different from that used by Intel.

Much like the "classic" 1GHz Athlon, the Athlon "Thunderbird" 1GHz uses the K75 core, and is manufactured using a 0.18 micron process. Right next to the core, we also find 256KB of full-speed L2 cache, in addition to the 128KB of full-speed L1 cache. By integrating the L2 cache into the body of the CPU, AMD has managed to reduce the L2 cache's latency from the "classic" Athlon's 21 cycles to the "Thunderbird"'s 11 cycles.
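To put the latency figures in perspective, here is a minimal sketch that converts cycle counts into nanoseconds. It assumes that both the 21-cycle and the 11-cycle figures are counted in core clock cycles and that both CPUs run at 1GHz; the function name latency_ns is purely illustrative.

```python
def latency_ns(cycles: int, clock_hz: float) -> float:
    """Convert a latency expressed in clock cycles to nanoseconds."""
    return cycles * 1e9 / clock_hz

# Assumption: both latencies are measured in core cycles at a 1GHz core clock.
CORE_CLOCK_HZ = 1e9

classic_ns = latency_ns(21, CORE_CLOCK_HZ)  # "classic" Athlon, external L2
tbird_ns = latency_ns(11, CORE_CLOCK_HZ)    # "Thunderbird", integrated L2

print(f"classic Athlon L2 latency: {classic_ns:.0f} ns")
print(f"Thunderbird L2 latency:    {tbird_ns:.0f} ns")
print(f"cycles saved per L2 hit:   {21 - 11}")
```

Under those assumptions, each L2 hit on the "Thunderbird" returns 10 cycles sooner than on the "classic" part.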

There are, of course, a few differences between the "Thunderbird"'s L2 cache and the cache implementation adopted by Intel for its "Coppermine" processors. The essential ones can be summed up as follows:

In its PR, AMD refers to its cache as being 384KB in size: the sum of the 256KB of L2 and the 128KB of L1. The reason for this is quite simple: AMD uses an "exclusive" cache architecture, while Intel's cache is "inclusive".

In an "inclusive" cache, all of the data that is stored within the L1 cache must be duplicated within the L2 cache. Thus, if your "Coppermine" has 32KB of information stored in its L1 cache, that same 32KB must be duplicated in the 256KB of L2 cache, leaving only 224KB of L2 cache available for other data. With its "exclusive" cache, the 128KB of L1 cache in the Athlon "Thunderbird" does not need to be duplicated in the L2 cache, as the latter is designed to contain information destined to be sent back to the main memory system. So, the "Thunderbird" can really be said to contain up to 384KB of internally cached information.
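The arithmetic behind the 224KB and 384KB figures can be sketched as follows. This is a simplified model that assumes the L1 is completely full and that the inclusion or exclusion rule is applied strictly; the function name unique_capacity_kb is illustrative only.

```python
def unique_capacity_kb(l1_kb: int, l2_kb: int, policy: str) -> int:
    """Return how much distinct data L1 and L2 can hold together."""
    if policy == "inclusive":
        # Every L1 line is also held in L2, so L1 adds no unique capacity.
        return l2_kb
    if policy == "exclusive":
        # L1 and L2 never hold the same line, so the capacities add up.
        return l1_kb + l2_kb
    raise ValueError(f"unknown policy: {policy}")

# "Coppermine": 32KB L1, 256KB L2, inclusive.
coppermine_total = unique_capacity_kb(32, 256, "inclusive")
coppermine_l2_free = 256 - 32  # L2 space not occupied by copies of L1 data

# "Thunderbird": 128KB L1, 256KB L2, exclusive.
thunderbird_total = unique_capacity_kb(128, 256, "exclusive")

print(f"Coppermine unique data:   {coppermine_total}KB "
      f"({coppermine_l2_free}KB of L2 left for non-L1 data)")
print(f"Thunderbird unique data:  {thunderbird_total}KB")
```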

That said, the "Thunderbird"'s L2 cache still uses a 64-bit bus, much like the "classic" Athlons. This is quite in contrast to the "CuMine"'s L2, which uses a 256-bit bus. At the same clock speed, that means the "Thunderbird"'s L2 has only 25% of the peak bandwidth of the "Coppermine"'s.
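As a rough illustration of the 25% figure, the sketch below computes theoretical peak L2 bandwidth from bus width alone. It assumes one transfer per core clock on both chips and a 1GHz core; sustained real-world figures will differ, and the function name is hypothetical.

```python
def peak_bandwidth_gb_s(bus_bits: int, clock_hz: float) -> float:
    """Peak bandwidth in GB/s, assuming one bus transfer per clock cycle."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * clock_hz / 1e9

CORE_CLOCK_HZ = 1e9  # assumption: both CPUs running at 1GHz

tbird_bw = peak_bandwidth_gb_s(64, CORE_CLOCK_HZ)        # 64-bit L2 bus
coppermine_bw = peak_bandwidth_gb_s(256, CORE_CLOCK_HZ)  # 256-bit L2 bus
ratio = tbird_bw / coppermine_bw

print(f"Thunderbird peak L2 bandwidth: {tbird_bw:.0f} GB/s")
print(f"Coppermine peak L2 bandwidth:  {coppermine_bw:.0f} GB/s")
print(f"ratio: {ratio:.0%}")
```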

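On the other hand, the "Thunderbird"'s L2 cache does benefit from being 16-way associative, compared to the "CuMine"'s 8-way associativity. Higher associativity gives each memory address more places where it can be stored in the cache, which reduces the number of useful entries that get evicted to make room for new ones. This one feature results in an overall performance boost for the "Thunderbird", as data and instructions are more often found in the cache when the CPU asks for them.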
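To show what associativity means in practice, here is a small sketch of how a set-associative cache is organized. The cache-line sizes used (64 bytes for the "Thunderbird", 32 bytes for the "Coppermine") are assumptions that do not come from this article, and the function names are illustrative.

```python
def num_sets(cache_bytes: int, ways: int, line_bytes: int) -> int:
    """Number of sets in a set-associative cache."""
    return cache_bytes // (ways * line_bytes)

def set_index(address: int, line_bytes: int, sets: int) -> int:
    """Which set a given memory address maps to."""
    return (address // line_bytes) % sets

L2_BYTES = 256 * 1024

# Assumed line sizes: 64 bytes (Thunderbird), 32 bytes (Coppermine).
tbird_sets = num_sets(L2_BYTES, ways=16, line_bytes=64)
coppermine_sets = num_sets(L2_BYTES, ways=8, line_bytes=32)

print(f"Thunderbird: {tbird_sets} sets, 16 lines may share each set")
print(f"Coppermine:  {coppermine_sets} sets, 8 lines may share each set")

# Addresses that collide on the same set compete for its 'ways' slots.
stride = tbird_sets * 64  # addresses this far apart map to the same set
colliding = [set_index(n * stride, 64, tbird_sets) for n in range(4)]
print(f"set indices for colliding addresses: {colliding}")
```

In this model, up to 16 addresses that collide on one set can stay resident in the "Thunderbird"'s L2 at once, against 8 in the "Coppermine"'s.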

Finally, the Athlon "Thunderbird" has been offered in two distinct packaging formats. The first is the Socket 462 format, more commonly known as Socket A. The other is the Slot A format, which was offered solely to manufacturers. As far as do-it-yourself types are concerned, then, there is only one type of "T-Bird" worth mentioning: the Socket A type. While similar in appearance to the Socket 370 format that is used by some Intel CPUs, AMD's Socket A interface has more pins (462 versus 370), and - of course - is designed with the Alpha's EV6, 100MHz DDR bus in mind.

Next: The tests