Throughout the 1990s and into the early 2000s, single-core CPUs dominated. Compiler optimizations, falling memory costs, steady reductions in feature (transistor) size, and pipeline optimizations yielded seemingly endless increases in performance. By around 2002, however, something changed, and the endless performance increases of the 1990s came to an end.
What ended the performance increases of the 1990s was really a combination of factors. Power consumption was becoming a real concern, and pipeline optimizations were reaching a point of diminishing returns. The advances that by 2004 had propelled processors to 3.2 GHz clock speeds and extremely long instruction pipelines were no longer a viable path to consistent performance increases, increases that were necessary to keep pace with Moore's Law.
The other major problem was the size of the copper wire. Computer chips are made primarily of copper and silicon, so the feature size of a chip is ultimately limited by the size of the copper atom. Furthermore, as wires get smaller they have more resistance and therefore generate more heat (that is, more electrical energy is converted to heat). We have not yet reached the point where processor feature sizes are as small as they can possibly go, but we are close: some researchers estimate that 10 nm is the limit, while many consumer desktop and notebook processors are currently at 45 nm.
The solution was to find a new way to optimize. Enter thread-level parallelism (TLP). A thread is a single stream of execution. Only one thread can execute at a time on a single-core CPU. However, to keep the processor busy (and to fool you into thinking multiple things are running at once), many threads may be swapped on and off the processor within a short period of time; this is sometimes referred to as time-division multiplexing. A multicore processor can run one thread on each core, and since each core executes independently (in terms of data and instructions), the degree of thread-level parallelism effectively scales with the number of cores.
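A minimal sketch of these ideas, using Python's standard-library `threading` module (the worker function and thread names are purely illustrative). Each `Thread` is an independent stream of execution handed to the OS scheduler; on a single core the scheduler time-slices them exactly as described above. (In CPython, the global interpreter lock means these threads are time-multiplexed even on a multicore machine, which makes this a sketch of the scheduling concept rather than of true core-level parallelism.)

```python
import threading

results = {}

def worker(name, n):
    # Each thread runs this function as its own stream of execution
    # and records an independent partial result.
    results[name] = sum(range(n))

# Create four threads; the OS swaps them on and off the processor.
threads = [threading.Thread(target=worker, args=(f"t{i}", 10_000))
           for i in range(4)]
for t in threads:
    t.start()   # hand a new stream of execution to the scheduler
for t in threads:
    t.join()    # wait until every thread has finished

# All four threads ran to completion, interleaved by the scheduler.
assert all(results[f"t{i}"] == sum(range(10_000)) for i in range(4))
```

From the program's point of view the four threads appear to run "at the same time," even though on one core only one of them is ever executing at a given instant.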
By using multiple cores, chip makers are betting that TLP can be exploited effectively: that (1) all the cores can actually be utilized, and (2) scaling up cores, and thereby scaling up TLP, will continue to produce better-performing computers. At the same time, chip makers are attempting to address power and size scaling concerns by putting lower-power cores in multicore processors. As it turns out, two lower-power cores can (and frequently do) outperform a single higher-power core. TLP to the rescue... at least for now.
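One way to see what "utilizing all the cores" means in practice is to split a CPU-bound job into one chunk per core. The sketch below does this with Python's standard-library `multiprocessing.Pool`; the work itself (summing squares) and the chunking scheme are hypothetical stand-ins for a real workload, not a benchmark.

```python
from multiprocessing import Pool, cpu_count

def sum_squares(bounds):
    # CPU-bound stand-in for real work: sum i*i over [lo, hi).
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def parallel_sum_squares(n, workers=None):
    """Split [0, n) into one chunk per worker process and combine results."""
    workers = workers or cpu_count()
    step = max(1, n // workers)
    # One half-open range per worker; the last chunk absorbs the remainder.
    chunks = [(w * step, (w + 1) * step if w < workers - 1 else n)
              for w in range(workers)]
    with Pool(workers) as pool:
        # Each chunk may run on a different core, independently.
        return sum(pool.map(sum_squares, chunks))
```

Because each worker process executes independently, the OS is free to place them on separate cores; whether that yields a speedup depends on the workload actually being divisible, which is exactly the bet described above.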