计算机选读(960)
Over the years, the clock frequency became the key measure of processor performance. In parallel with Moore's Law, which predicts that the number of transistors on a chip increases exponentially, the clock frequency has done the same, doubling roughly every 18 months from thousands of ticks per second in 1977, to millions in the 1980s, to billions today. But while optimists believe that this process will continue, chip developers across the industry now agree that clock frequency will no longer be the key metric of processor performance, for several reasons.
The first is the growth of parallelism: the practice of getting a chip to execute many different operations simultaneously. In the past, this was confined to the realm of high-end supercomputers, as a way of improving their performance. But it is now becoming common in personal computers, and is bound to become more so.
A driving factor behind this parallelism is the fact that, while processor speed has increased with such remarkable rapidity, the speed of memories has lagged. What's more, the gap between processor speed and memory speed is likely to grow. Parallelism within a single chip allows several different processing units to share the same memory, so the memory's slowness is not such a problem.
This is because the limiting factor is not so much the throughput of memory chips (the rate at which data can be moved in and out of them) as the administrative overhead associated with moving information in and out of the processor. Because of this, chip designers can gain by putting several distinct processors on the same chip and having them share a fast, local memory inside the chip itself. This approach is known as multiple cores, or multi-core for short. A related approach is known as simultaneous multi-threading. It involves modifying a single processor to enable it to switch quickly between several distinct tasks. While one task is waiting for data to arrive from the main memory, another can continue to execute, so a single processor can, in effect, do the work of many.
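The latency-hiding idea behind multi-threading can be illustrated with a toy simulation. This is only a sketch under invented assumptions, not a model of any real processor: the function name `total_cycles`, the burst count, and the cycle figures (2 cycles of computing followed by a 6-cycle wait for memory) are all hypothetical and chosen purely for illustration.

```python
def total_cycles(num_tasks, bursts=4, compute=2, wait=6):
    """Cycles for one processor to finish `num_tasks` identical tasks.

    Each task alternates a compute burst with a wait for main memory.
    The processor switches to any task whose data has already arrived.
    All numbers are illustrative assumptions, not measurements.
    """
    remaining = [bursts] * num_tasks  # compute bursts left per task
    ready_at = [0] * num_tasks        # cycle at which each task may run again
    clock = 0
    while any(remaining):
        runnable = [t for t in range(num_tasks)
                    if remaining[t] and ready_at[t] <= clock]
        if not runnable:
            # Every unfinished task is waiting on memory: the processor idles
            # until the earliest data arrives.
            clock = min(ready_at[t] for t in range(num_tasks) if remaining[t])
            continue
        task = runnable[0]
        clock += compute               # run one compute burst
        remaining[task] -= 1
        ready_at[task] = clock + wait  # its next data arrives later
    return clock


print(total_cycles(1))  # one task: the processor idles during every wait
print(total_cycles(4))  # four tasks: waits overlap with useful work
```

With these assumed numbers, a single task takes 26 cycles, most of them spent idle, while four interleaved tasks finish in 32 cycles rather than the 104 that running them one after another would need.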
A second reason why clock frequency will no longer be an accurate measure of performance is that distributing the clock's signal to all the different parts of a chip is more difficult than it sounds. Reducing the “skew” on a chip (the amount by which clock signals might be out of sync) takes a very skillful chip designer. It is becoming more difficult as chips get larger and more complex.
That's why “asynchronous” technology, which involves getting rid of the clock entirely, is being explored aggressively. This approach has costs as well as benefits, since miniature circuits known as “rendezvous circuits” must be placed at circuit junctions to co-ordinate the flow of data. It is rather like replacing a city-wide network of traffic lights with policemen at every corner. In one recent experiment with a test chip that could run in both synchronous and asynchronous modes, the asynchronous mode won out. That's because in a synchronous design, every operation must wait for the slowest one to complete, while in an asynchronous one, a laggard only delays the local part of the calculation.
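The trade-off between the two modes can be put into rough numbers. The sketch below is a deliberately simplified model, not a description of the test chip mentioned above: it assumes a chain of operations with made-up delays, a clock period set by the slowest operation, and a fixed hypothetical `handshake` cost standing in for the rendezvous circuits.

```python
def synchronous_time(delays):
    """Every operation occupies a full clock period, and the period
    must be long enough for the slowest operation."""
    return len(delays) * max(delays)


def asynchronous_time(delays, handshake=1):
    """Each operation takes only its own delay, plus an assumed fixed
    cost for the rendezvous circuit that hands the data on."""
    return sum(d + handshake for d in delays)


uneven = [2, 3, 2, 9, 2, 3]   # one laggard among fast operations
uniform = [5, 5, 5, 5]        # all operations equally slow

print(synchronous_time(uneven), asynchronous_time(uneven))
print(synchronous_time(uniform), asynchronous_time(uniform))
```

Under these assumptions the uneven chain takes 54 time units with a clock but only 27 without one, because the laggard is paid for once rather than on every step. The uniform chain shows the cost side: 20 units with a clock against 24 without, since the handshake overhead buys nothing when no operation is faster than the rest.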
Clockless chips also have the added benefit of emitting far less radio interference. So asynchronous circuits could be particularly useful in devices such as mobile phones, where radio interference is a substantial concern.
Finally, getting chips to run at higher clock frequencies is diminishing in importance because another problem is becoming more pressing: getting them to consume less power. Power consumption is now the biggest problem in chip design.
速度不是一切
多年来,时钟频率一直是处理器性能的主要测量指标。与预测芯片上晶体管数目指数地急剧增加的摩尔定律相辅相成,时钟频率也是做同样的事,每18个月翻一番,从1977年的每秒几千次,增加到上个世纪80年代的几百万次,到目前的几十亿次。虽然乐观主义者还认为这个过程将继续,但是全行业的芯片开发人员都同意,时钟频率因多种原因将不再是处理器性能的主要指标。
首先是并行处理的发展——让芯片同时执行很多不同操作的做法。过去,并行处理仅限于高端的巨型机,作为提高性能的方法。但现在,它在个人计算机中也已常见,而且会越来越多。
并行处理背后的驱动因素是这样一个事实,当处理器的速度快速提高的同时,存储器的速度却落后了。更有甚者,处理器速度与存储器速度之间的差距有可能拉大。单一芯片中的并行性让几个不同的处理器共享同一存储器,从而存储器的缓慢不再是一个问题。
这是因为很大程度上限制因素不是存储器芯片的吞吐能力(数据进出存储器的速率),而是与信息出入处理器相关联的管理开销。正由于此原因,芯片设计师能做到在同一芯片中放入多个处理器并共享该芯片内的快速本地存储器。该方法叫做多内核。另一个相关的方法叫同时多线程。它涉及到改进单一处理器,使之能在几个不同的任务之间快速转换。当一个任务等待主存中的数据送来之时,另一个任务能继续执行——从而单个处理器实际上能做很多工作。
时钟频率不再是性能的精确测量指标的第二个原因是,将时钟信号分配到芯片的不同部分,要比说说困难得多。减少芯片上的“偏差”——时钟信号失去同步的程度,需要技术高超的芯片设计师。随着芯片越来越大、越来越复杂,这个问题也变得更加困难。
这就是为什么“异步”技术得到大力研究开发,该技术涉及到将时钟彻底去除。此方法既有得也有失,因为必须在电路交接点放置称之为“聚集电路”的微型化芯片,以协调数据的流动。这相当于在每个路口用警察代替整个城市的交通信号灯网络。在最近的一次对测试芯片(在同步和异步方式下都能运行)进行的实验中,异步方式胜出。这是因为在同步设计中每个操作必须等待最慢操作的完成,而在异步方式下迟缓的操作只是延缓局部的计算。
无时钟的芯片还有一个额外的优点,即辐射更少的射频干扰。因而异步电路特别适合用于移动电话等对射频干扰非常关注的设备。
最后,让芯片运行在更高时钟频率的重要性也在减小,因为另一个问题变得更为迫切:让(芯片)消耗更少的电能。能耗现已是芯片设计中最大的问题。