INTEL PENTIUM PROCESSOR
In June 1989, Vinod Dham drew up the first outline of the processor, codenamed P5. Dham is widely known in the West as the "Father of the Pentium chip". By late 1991 the processor layout was complete and engineers were able to run software on it; work then moved on to optimizing the layout and improving efficiency. In February 1992 the design was essentially finished, and comprehensive testing of an experimental batch of processors began. In April 1992 the decision was made to start volume production, with Oregon factory No. 5 chosen as the main production site. Ramp-up of production and final tuning of the technical characteristics followed.
In October 1992, Intel announced that the fifth-generation processors, formerly codenamed P5, would be called Pentium rather than 586, as many had expected. The reason was that many processor manufacturers were actively producing clones (and not only clones) of the 386 and 486 processors. Intel had intended to register "586" as a trademark so that no one else could sell processors under that name, but it turned out that numbers cannot be registered as trademarks. The new processors were therefore named Pentium (from the Ancient Greek "πέντε", five), which also indicated the processor generation. The new microprocessor was introduced on March 22, 1993, and the first Pentium-based computers appeared a few months later.
Pentium MMX is an Intel processor released on January 8, 1997, based on the third-generation P5 core (P55C). Intel's research and development center in Haifa, Israel, added a new instruction set called MMX (MultiMedia eXtension) to the P55C core, which significantly increases performance in multimedia applications (by 10 to 60%, depending on optimization). These processors are called Pentium w/ MMX technology (usually shortened to Pentium MMX).
The processor includes a pipelined MMX unit, and to improve performance the L1 cache was increased to 32 KB (16 KB for data and 16 KB for instructions). MMX adds 57 new instructions for parallel processing of integer data and introduces a 64-bit data type. Models with clock frequencies of 166 MHz, 200 MHz and 233 MHz were available [1].
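To illustrate what parallel processing of integer data in a 64-bit type means, the following minimal sketch emulates two packed byte additions in plain Python. It is only a model of the idea: the helper names (pack_bytes, unpack_bytes, paddb, paddusb) are chosen here for illustration, borrowing the mnemonics of the MMX packed-add instructions, and nothing in it executes real MMX code.

```python
def pack_bytes(values):
    """Pack eight 8-bit values into one 64-bit integer, lane 0 in the low byte."""
    if len(values) != 8:
        raise ValueError("expected exactly eight byte lanes")
    word = 0
    for lane, value in enumerate(values):
        word |= (value & 0xFF) << (8 * lane)
    return word


def unpack_bytes(word):
    """Split a 64-bit integer back into its eight byte lanes."""
    return [(word >> (8 * lane)) & 0xFF for lane in range(8)]


def paddb(a, b):
    """Add eight byte lanes independently, wrapping around on overflow."""
    lanes = zip(unpack_bytes(a), unpack_bytes(b))
    return pack_bytes([(x + y) & 0xFF for x, y in lanes])


def paddusb(a, b):
    """Add eight unsigned byte lanes independently, saturating at 255."""
    lanes = zip(unpack_bytes(a), unpack_bytes(b))
    return pack_bytes([min(x + y, 0xFF) for x, y in lanes])


a = pack_bytes([10, 20, 30, 40, 50, 60, 250, 255])
b = pack_bytes([1, 2, 3, 4, 5, 6, 10, 1])
print(unpack_bytes(paddb(a, b)))    # [11, 22, 33, 44, 55, 66, 4, 0]
print(unpack_bytes(paddusb(a, b)))  # [11, 22, 33, 44, 55, 66, 255, 255]
```

One operation handles all eight lanes at once, and no carry crosses from one lane into the next. The saturating variant is the kind of behavior that suits pixel and audio data, where wrapping around would produce visible or audible artifacts.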
The processor contains 4.5 million transistors, was manufactured on a then-advanced 350 nm silicon CMOS process, and operated at a reduced voltage of 2.8 V. The maximum current consumption is 6.5 A and the heat dissipation is 17 W (for the Pentium 233 MMX). The die area of the Pentium MMX processors is 141 mm². The processors were manufactured in a 296-pin CPGA or PPGA package for Socket 7.
Processors based on this core and intended for portable computers were used in the so-called "mobile module", MMC-1 (Mobile Module Connector), which had 280 pins, worked together with the Intel 430TX chipset, and was accompanied by 512 KB of cache on the motherboard. The Tillamook core (named after a city in Oregon, USA) is a P55C core with a reduced supply voltage: the 300 MHz model ran at 2.0 V, drew 4.5 A, and dissipated 8.4 W. The higher-end models (233, 266 and 300 MHz) were produced on a 250 nm process and had a die area of 90 mm²; there were also versions with a 166 MHz core frequency. The 200 and 233 MHz models were released in August 1997, the 266 MHz model in January 1998, and the top model in the lineup was introduced in January 1999.
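As a quick sanity check on the electrical figures quoted above, supply voltage times maximum current gives an upper bound on the power drawn, and the quoted heat dissipation should not exceed it. The sketch below only redoes that arithmetic with the numbers from the text; the function name and the table layout are illustrative.

```python
def electrical_power_w(voltage_v, current_a):
    """Upper bound on power draw in watts: supply voltage times maximum current."""
    return voltage_v * current_a


# (supply voltage in V, maximum current in A, quoted heat dissipation in W),
# all taken from the figures given in the text
parts = {
    "Pentium 233 MMX (P55C)": (2.8, 6.5, 17.0),
    "Tillamook 300 MHz": (2.0, 4.5, 8.4),
}

for name, (voltage, current, heat) in parts.items():
    bound = electrical_power_w(voltage, current)
    print(f"{name}: V*I = {bound:.1f} W, quoted heat = {heat:.1f} W")
```

In both cases the quoted heat dissipation sits slightly below the product of voltage and maximum current (18.2 W and 9.0 W respectively), so the figures are mutually consistent.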
Superscalar architecture. Thanks to its superscalar architecture, the processor can execute two instructions per clock cycle. This is possible because it has two pipelines, U and V. The U pipeline is the main one and performs all integer and floating-point operations; the V pipeline is auxiliary and performs only simple integer operations and some floating-point ones. For old programs (written for the 486) to take full advantage of this architecture, they had to be recompiled. The Pentium was the first CISC processor to use a multi-pipeline architecture.
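Why recompilation matters can be seen in a toy issue model. The sketch below assumes a deliberately simplified pairing rule: two adjacent instructions issue together if the second is simple enough for the V pipeline and neither reads nor writes the register written by the first. The real pairing rules are more detailed, every instruction is assumed to take one cycle, and the names Instr, can_pair and count_cycles are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    dest: str            # register written by the instruction
    srcs: tuple          # registers read by the instruction
    simple: bool = True  # True if the V pipeline could run it (model assumption)


def can_pair(first, second):
    """Simplified rule: the second instruction must be V-capable and independent."""
    if not second.simple:
        return False
    read_after_write = first.dest in second.srcs
    write_after_write = first.dest == second.dest
    return not (read_after_write or write_after_write)


def count_cycles(program):
    """Issue one or two instructions per cycle, strictly in program order."""
    cycles = 0
    i = 0
    while i < len(program):
        if i + 1 < len(program) and can_pair(program[i], program[i + 1]):
            i += 2  # both instructions issue together in U and V
        else:
            i += 1  # only the U pipeline is used this cycle
        cycles += 1
    return cycles


# Each instruction needs the result of the previous one: nothing pairs.
dependent = [
    Instr("add eax, ebx", "eax", ("eax", "ebx")),
    Instr("add ecx, eax", "ecx", ("ecx", "eax")),
    Instr("add edx, ecx", "edx", ("edx", "ecx")),
    Instr("add esi, edx", "esi", ("esi", "edx")),
]

# Neighbouring instructions are independent: every pair issues together.
independent = [
    Instr("add eax, ebx", "eax", ("eax", "ebx")),
    Instr("add ecx, edx", "ecx", ("ecx", "edx")),
    Instr("add esi, edi", "esi", ("esi", "edi")),
    Instr("add ebp, ebx", "ebp", ("ebp", "ebx")),
]

print(count_cycles(dependent))    # 4
print(count_cycles(independent))  # 2
```

A compiler that knows about the two pipelines can reorder instructions so that independent ones sit next to each other, which is the kind of scheduling a 486-era build would not have done.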
64-bit data bus. It allows the Pentium processor to exchange twice as much data with RAM per bus cycle as the 486 (at the same clock speed).
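The factor of two follows directly from the bus widths. The small calculation below assumes one transfer per bus clock and uses 66 MHz purely as an illustrative clock value applied to both processors; the function names are made up for this example.

```python
def bytes_per_bus_cycle(bus_width_bits):
    """Number of bytes moved in one bus transfer."""
    return bus_width_bits // 8


def peak_bandwidth_mb_s(bus_width_bits, bus_clock_mhz):
    """Peak transfer rate in MB/s (10**6 bytes), assuming one transfer per clock."""
    return bytes_per_bus_cycle(bus_width_bits) * bus_clock_mhz


BUS_CLOCK_MHZ = 66  # illustrative value, applied to both processors

for name, width in (("486, 32-bit bus", 32), ("Pentium, 64-bit bus", 64)):
    print(f"{name}: {bytes_per_bus_cycle(width)} bytes per cycle, "
          f"{peak_bandwidth_mb_s(width, BUS_CLOCK_MHZ)} MB/s peak")
```

At equal clock speed the ratio is always 8 bytes to 4 bytes per cycle, whatever clock value is plugged in.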
Branch prediction. This mechanism reduces pipeline stalls caused by delays in instruction fetch when the instruction pointer changes during the execution of branch instructions. To do this, the processor uses a Branch Target Buffer (BTB) together with branch address prediction algorithms.
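The general idea of a branch target buffer can be sketched as a small table that remembers, per branch address, the last target and a short history of outcomes. The model below is a generic textbook design, not a description of the Pentium's actual BTB: the direct-mapped layout, the 256-entry size, the two-bit saturating counter and the allocation policy are all assumptions made for illustration.

```python
class BranchTargetBuffer:
    """Toy direct-mapped BTB with a 2-bit saturating counter per entry."""

    def __init__(self, entries=256):
        self.entries = entries
        self.table = {}  # index -> [branch address, target, counter 0..3]

    def predict(self, pc):
        """Return the predicted target, or None to predict 'not taken'."""
        entry = self.table.get(pc % self.entries)
        if entry is not None and entry[0] == pc and entry[2] >= 2:
            return entry[1]
        return None

    def update(self, pc, taken, target):
        """Record the real outcome of the branch at address pc."""
        index = pc % self.entries
        entry = self.table.get(index)
        if entry is None or entry[0] != pc:
            if taken:
                # Allocate on the first taken branch, starting at 'weakly taken'.
                self.table[index] = [pc, target, 2]
            return
        if taken:
            entry[1] = target
            entry[2] = min(entry[2] + 1, 3)
        else:
            entry[2] = max(entry[2] - 1, 0)


def count_mispredictions(btb, pc, target, outcomes):
    """Replay a sequence of taken / not-taken outcomes for one branch."""
    misses = 0
    for taken in outcomes:
        predicted = btb.predict(pc)
        actual = target if taken else None
        if predicted != actual:
            misses += 1
        btb.update(pc, taken, target)
    return misses


# A loop-closing branch: taken nine times, then falls through once.
loop_outcomes = [True] * 9 + [False]
print(count_mispredictions(BranchTargetBuffer(), 0x1000, 0x0F00, loop_outcomes))  # 2
```

In this model only the first and the last iteration of the loop are mispredicted; for the other eight, the fetch stage already knows where to continue and the pipelines do not have to wait.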
Separate caching of program code and data, which reduces the number of cache misses when fetching instructions and operands compared to the 80486. Pentium processors have a 16 KB first-level cache (L1 cache) divided into two segments: 8 KB for data and 8 KB for instructions. To reduce access time and implementation cost, both segments are 2-way set-associative, in contrast to the 4-way cache of the 80486.
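What 2-way set-associative means in practice can be shown with a small simulation of one 8 KB cache segment. The 8 KB size and the two ways come from the text; the 32-byte line size and the least-recently-used replacement policy are assumptions of this sketch, and the class name is illustrative.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy set-associative cache that tracks hits and misses only (no data)."""

    def __init__(self, size_bytes=8 * 1024, ways=2, line_bytes=32):
        self.ways = ways
        self.line_bytes = line_bytes  # assumed line size
        self.num_sets = size_bytes // (ways * line_bytes)
        # One ordered dict per set; the least recently used tag comes first.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def locate(self, address):
        """Split an address into (set index, tag)."""
        line = address // self.line_bytes
        return line % self.num_sets, line // self.num_sets

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then loaded)."""
        index, tag = self.locate(address)
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)     # mark as most recently used
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used line
        lines[tag] = True
        return False


cache = SetAssociativeCache()
print(cache.num_sets)  # 128

# 0x0000, 0x1000 and 0x2000 are 4 KB apart and all map to set 0.
trace = [0x0000, 0x1000, 0x0000, 0x2000, 0x0000, 0x1000]
print([cache.access(address) for address in trace])
# [False, False, True, False, True, False]
```

Two lines that collide in the same set can coexist, but a third one forces an eviction. A 4-way cache of the same size would tolerate four such lines, at the price of comparing more tags on every access, which is the trade-off the text describes.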
Improved floating-point unit (FPU). Some instructions were sped up by an order of magnitude; FMUL, for example, became 15 times faster. The processor can also execute the FXCH ST(x) instruction in parallel with ordinary instructions (arithmetic or register load/store).
Four-input address adders. These reduce address calculation latency compared to the 80486. A Pentium processor can compute an effective address in a single clock cycle when using base-plus-index addressing with scaling and displacement. The 80486 has a three-input address adder, so computing such an address takes it two clock cycles.
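The address form in question is base + index × scale + displacement. The sketch below computes it for 32-bit addressing; treating the segment base as the fourth summand of the adder is an interpretation made here, not something the text states, and the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF  # addresses wrap around at 32 bits


def effective_address(base, index, scale, displacement):
    """Offset within the segment: base + index * scale + displacement."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & MASK32


def linear_address(segment_base, base, index, scale, displacement):
    """Segment base plus effective address (assumed here to be the fourth input)."""
    offset = effective_address(base, index, scale, displacement)
    return (segment_base + offset) & MASK32


# Operand of the form [ebx + esi*4 + 16] with ebx = 0x1000 and esi = 3
print(hex(effective_address(0x1000, 3, 4, 16)))               # 0x101c
print(hex(linear_address(0x00400000, 0x1000, 3, 4, 16)))      # 0x40101c
```

With only three inputs, one of the summands has to wait for a second pass through the adder, which matches the extra clock cycle the text attributes to the 80486.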
The microcode can use both pipelines, so instructions with a repeat prefix, such as REP MOVSW, perform one iteration per clock cycle, whereas the 80486 requires three cycles per iteration.
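For readers unfamiliar with REP MOVSW, it copies a given number of 16-bit words from one memory location to another, advancing both pointers after each word. The sketch below models that behaviour on a flat byte array (forward direction only) and applies the per-iteration cycle counts quoted above; it ignores any start-up overhead of the instruction, and the function names are illustrative.

```python
def rep_movsw(memory, src, dst, count):
    """Copy `count` 16-bit words from src to dst, one word per iteration."""
    for _ in range(count):
        memory[dst:dst + 2] = memory[src:src + 2]
        src += 2
        dst += 2
    return src, dst  # final pointer values, as the instruction would leave them


def estimated_cycles(count, cycles_per_iteration):
    """Iteration cost only; instruction start-up overhead is not modelled."""
    return count * cycles_per_iteration


memory = bytearray(range(16)) + bytearray(16)
rep_movsw(memory, src=0, dst=16, count=8)
print(memory[16:32] == memory[0:16])  # True

words = 8
print(estimated_cycles(words, 1))  # Pentium, per the text: 8
print(estimated_cycles(words, 3))  # 80486, per the text: 24
```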
A faster, fully hardware multiplier reduces the execution time of the MUL and IMUL instructions compared to the 80486 and makes it more predictable. For 32-bit operands, the execution time drops from 13-42 clock cycles to 10-11.
Interrupt virtualization, which speeds up virtual 8086 mode.