Intel Core i7 920 940 965 Review & Overclocking - PAGE 2
William Henning - Sunday, November 2nd, 2008

Core i7 Architecture

To optimize performance and reduce power consumption, Nehalem takes power management to new heights for Intel. Not only can it run on a single underclocked core when the computer is lightly loaded, it can also automatically overclock anywhere from one to four cores when more performance is needed! Intel hopes to use the same Nehalem design to address the needs of servers, mobile computing, and desktop/workstation computers.

You read that right: Core i7 chips automatically overclock themselves to a certain degree when needed. Mind you, the overclock is not much at this time - just one step of the multiplier when all four cores are busy - but the chip can potentially overclock fewer cores even higher, as long as the processor as a whole stays within its rated TDP envelope.
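
To put "one step of the multiplier" into numbers, here is a minimal sketch. It assumes a 133.33 MHz base clock and a hypothetical part with a 20x stock multiplier; neither figure is stated on this page, so treat both as illustrative assumptions rather than measured values.

```python
# Assumption: 133.33 MHz base clock (not stated in the article text).
BASE_CLOCK_MHZ = 133.33


def turbo_frequency_mhz(base_multiplier, extra_steps):
    """Core clock after raising the multiplier by `extra_steps` steps."""
    return BASE_CLOCK_MHZ * (base_multiplier + extra_steps)


# Hypothetical part with a 20x stock multiplier.
stock = turbo_frequency_mhz(20, 0)
all_cores_turbo = turbo_frequency_mhz(20, 1)  # one step, all four cores busy

print(f"stock: {stock:.0f} MHz, one-step turbo: {all_cores_turbo:.0f} MHz")
```

Under these assumptions a single step is worth roughly 133 MHz, which is why the automatic overclock is described as modest.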

The Core i7 keeps all of the performance features of the Core 2:

  • Wide Dynamic Execution - 4-wide decode/rename/retire
  • Advanced Digital Media Boost - 128-bit SSE instructions executed in one cycle
  • Intel HD Boost - the new SSE4.1 instructions
  • Smart Memory Access - memory disambiguation and hardware prefetching
  • Advanced Smart Cache - low latency, high bandwidth shared L2 cache

and adds new advances of its own:

  • new SSE4.2 instructions adding string handling and CRC32 calculations
  • improved locking support
  • an additional level in the cache hierarchy
  • improved looping and streaming support
  • better branch prediction
  • improved virtualization support with faster transitions into / out of virtual machines
  • simultaneous multi-threading - keeping the work units busy and improving performance
  • new TLB hierarchy - adds a 512-entry second-level TLB for small pages
  • fast 16-byte unaligned access - basically eliminating the unaligned access speed penalty
  • faster synchronization primitives - improving multi-threaded performance
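
To give a feel for what the SSE4.2 CRC32 instruction computes, here is a bit-by-bit software sketch of the same checksum family (CRC-32C, the Castagnoli polynomial). The function name `crc32c` is hypothetical, and the initial and final bit inversion follows the usual software convention rather than anything the instruction does by itself. The hardware handles up to eight bytes per instruction, while this loop works one bit at a time.

```python
# Castagnoli polynomial in bit-reflected form, as used by CRC-32C.
CRC32C_POLY_REFLECTED = 0x82F63B78


def crc32c(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-32C; pass a previous result as `crc` to continue a stream."""
    crc ^= 0xFFFFFFFF  # conventional initial inversion
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF  # conventional final inversion


# Standard check value for CRC-32C.
print(hex(crc32c(b"123456789")))  # 0xe3069283
```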

Each core has 32KB of instruction cache, 32KB of data cache, and a private unified 256KB L2 cache - and the four cores share a massive 8MB of L3 cache.
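
Adding up the figures above gives the total on-die cache of a quad-core part. This is a small arithmetic sketch using only the sizes quoted in the previous paragraph; the variable names are illustrative.

```python
CORES = 4
L1_INSTRUCTION_KB = 32
L1_DATA_KB = 32
L2_KB = 256
L3_SHARED_KB = 8 * 1024  # 8MB shared by all four cores

per_core_kb = L1_INSTRUCTION_KB + L1_DATA_KB + L2_KB
total_kb = CORES * per_core_kb + L3_SHARED_KB

print(f"private cache per core: {per_core_kb} KB")
print(f"total on-die cache: {total_kb} KB ({total_kb / 1024:.2f} MB)")
```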

At the "front end" of a core, the 4-instruction-wide decoder is followed by a "macro fusion" unit and a loop stream detector.

The macro fusion unit can combine a TEST or CMP instruction followed by a conditional branch into a single operation, thus improving throughput and effectively executing more instructions per unit time.
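
The idea can be sketched as a toy model that walks a list of mnemonics and merges each fusible pair into one operation. The function names are hypothetical, and the model deliberately simplifies: it treats every conditional jump as fusible, whereas real hardware may place further restrictions on which pairs qualify.

```python
FUSIBLE_FIRST = {"CMP", "TEST"}


def is_conditional_branch(mnemonic):
    """Toy rule: any x86 jump except the unconditional JMP."""
    return mnemonic.startswith("J") and mnemonic != "JMP"


def macro_fuse(instructions):
    """Merge CMP/TEST + conditional branch pairs into single operations."""
    fused = []
    i = 0
    while i < len(instructions):
        current = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        if (current in FUSIBLE_FIRST and following is not None
                and is_conditional_branch(following)):
            fused.append(f"{current}+{following}")
            i += 2  # both instructions consumed as one operation
        else:
            fused.append(current)
            i += 1
    return fused


loop_body = ["MOV", "ADD", "CMP", "JNE"]
print(macro_fuse(loop_body))  # ['MOV', 'ADD', 'CMP+JNE']
```

Four instructions become three operations in this example, which is where the throughput gain comes from.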

When the loop stream detector recognizes a small loop, there is no need to keep fetching and decoding the same instructions repeatedly, nor to predict the loop's branches, so the unneeded gates can be disabled - this leads to higher performance and lower power consumption. Nehalem also improves on branch prediction by going to a multi-level approach.

At the "Execution Unit", Nehalem uses a "unified reservation station" to schedule work among the six potential execution units - and it can potentially execute six operations per clock cycle:

  • One Load from memory
  • One Store Address
  • One Store Data
  • Three "Computational Operations" such as math, logic or branch operations
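
The per-cycle limits in the list above can be modelled with a toy issue loop. This is only a sketch: it assumes every operation is ready and independent, which real code rarely is, and the names `PORT_LIMITS` and `issue_cycles` are hypothetical.

```python
from collections import Counter

# Per-cycle issue limits taken from the list above.
PORT_LIMITS = {"load": 1, "store_addr": 1, "store_data": 1, "compute": 3}


def issue_cycles(ready_ops):
    """Cycles needed to issue all ops, ignoring dependencies and latency."""
    remaining = Counter(ready_ops)
    unknown = set(remaining) - set(PORT_LIMITS)
    if unknown:
        raise ValueError(f"unknown operation kinds: {sorted(unknown)}")
    cycles = 0
    while sum(remaining.values()) > 0:
        for kind, limit in PORT_LIMITS.items():
            remaining[kind] -= min(limit, remaining[kind])
        cycles += 1
    return cycles


ops = ["compute"] * 6 + ["load"] * 2 + ["store_addr", "store_data"]
print(issue_cycles(ops))  # 2
```

Ten operations issue in two cycles here because the mix is balanced; four back-to-back loads would still take four cycles, since only one load can issue per clock.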

While Penryn (Core 2) already has a similar Reservation Station scheme, Nehalem significantly improves on it.

Due to the addition of SMT, Nehalem has 36 reservation stations instead of Penryn's 32, 48 load buffers instead of 32, and 32 store buffers instead of 20.
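
Expressed as percentages, the growth is uneven across the three structures. The short sketch below uses only the figures quoted above; the dictionary and function names are illustrative.

```python
# (Penryn, Nehalem) entry counts as quoted in the text.
BUFFERS = {
    "reservation stations": (32, 36),
    "load buffers": (32, 48),
    "store buffers": (20, 32),
}


def percent_increase(old, new):
    """Relative growth from `old` to `new`, in percent."""
    return (new - old) * 100 / old


for name, (penryn, nehalem) in BUFFERS.items():
    growth = percent_increase(penryn, nehalem)
    print(f"{name}: {penryn} -> {nehalem} (+{growth:.1f}%)")
```

The load and store buffers grow far more than the reservation stations, at 50% and 60% against 12.5%.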

These changes also help Simultaneous Multi-Threading keep busy more of the execution units that would otherwise sit idle, increasing performance instead of wasting power.


Article Index

1.Introduction
2.Core i7 Architecture
3.Core i7 Architecture - Continued
4.Core i7 920 - the value i7
5.Core i7 940 - mid range part
6.Core i7 965 Extreme - high end part
7.X58 & ICH10 North & South bridge for the Core i7
8.Thermalright Socket 1366 Cooler
9.Intel XM25 SSD & Qimonda DDR3
10.Intel Extreme Motherboard DX58SO
11.The BIOS
12.More BIOS
13.Test Setup & Benchmarks
14.Business Winstone & Content Creation
15.WinRAR, HDTach & HDTune
16.Sandra CPU & MMM
17.Sandra Bandwidth & Latency
18.RightMark Read & Write
19.RightMark Bandwidth & Latency
20.Lame & TMPGEnc
21.CineBench & POV-Ray
22.Doom 3 & Quake 4
23.UT2003, Halo & Jedi Knight
24.Comanche 4 & Call of Duty
25.World In Conflict
26.Crysis
27.Devil May Cry 4
28.Dynasty Warriors 6 Benchmark
29.Overclocking
30.Power Consumption
31.Conclusion
