Phillipharryt 6 years ago

From the article "The heat generation per unit area of an integrated circuit passed the surface of a 100-watt light bulb in the mid 1990s, and now is somewhere between the inside of a nuclear reactor and the surface of a star. "

I can't tell if this is hyperbole or not, it amazes me but no amount of googling is coming up with a useful answer. Is anyone able to confirm or deny it for me?

  • andlier 6 years ago

    After a quick napkin calculation plus some googling, it seems the sun has around 20 kW of power per square centimeter. So it's not entirely infeasible that a cooler star, or a nuclear reactor, is closer to the typical 10-100 W/cm2 of a modern CPU/GPU. Still some orders of magnitude off from our closest star. (Hope the calculation is correct.)

    • perl4ever 6 years ago

      Looking at the wikipedia page on red dwarf stars[1], it appears a small star (M9V) might have 8% of the sun's radius and 0.015% of its luminosity. Thus, it would have about 156 times less area, and 2.3% of the output per area of the sun. So taking your figure as given, that means it would be less than 500W / cm^2.

      [1]https://en.wikipedia.org/wiki/Red_dwarf

    • Cybiote 6 years ago

      The temperature of the sun at its surface is ~5772 Kelvin. To get power per unit area, use the Stefan-Boltzmann law, j = σT^4: σ × (5772 K)^4 ≈ 6294 W/cm^2. Dividing the sun's luminosity (power) by its surface area will also give a similar value.

      An 815 mm^2, 250 W GPU dissipates 250 W / 8.15 cm^2 ≈ 31 W/cm^2.
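      Both figures are easy to reproduce in a few lines of Python (SI constants; the 815 mm^2 / 250 W GPU is the example quoted above):

```python
SIGMA = 5.670e-8          # Stefan-Boltzmann constant, W / (m^2 K^4)
T_SUN = 5772.0            # effective surface temperature of the sun, K

# Radiated power per unit area at the sun's surface
sun_flux_w_per_m2 = SIGMA * T_SUN**4
sun_flux_w_per_cm2 = sun_flux_w_per_m2 / 1e4   # ~6.3 kW/cm^2

# Heat flux of a 250 W GPU spread over an 815 mm^2 die
gpu_flux_w_per_cm2 = 250.0 / (815 / 100.0)     # ~31 W/cm^2
```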

      • frozenport 6 years ago

        Doesn't that only include radiated heat? If I touched it, it would be hotter.

        • saagarjha 6 years ago

          It wouldn't be hotter (the temperature wouldn't change), but it would transfer more heat to you.

  • ars 6 years ago

    From:

    http://www.wolframalpha.com/input/?i=Sun+%7C+luminosity+%2F+...

    The sun is: 6300 W/cm^2

    > inside of a nuclear reactor

    Inside means volume? That's not the same unit, so I'm not really sure how to calculate that.

    Interestingly, on a per-volume basis, the sun's output is quite low. The sun is just very, very large.

    See: http://www.wolframalpha.com/input/?i=Sun+%7C+luminosity+%2F+...

    And you find the sun is only 1.383×10^-6 horsepower per gallon :) Or if you insist: 0.2725 W/m^3, which as you can see is really, really low.
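    A quick sketch of that per-volume figure, using round published values for the sun's luminosity and radius:

```python
import math

L_SUN = 3.828e26          # solar luminosity, W
R_SUN = 6.957e8           # solar radius, m

volume = (4.0 / 3.0) * math.pi * R_SUN**3       # m^3
power_per_m3 = L_SUN / volume                   # ~0.27 W/m^3
```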

  • zaarn 6 years ago

    Intel's latest CPU innovation, the i9 "Industrial Cooling Required" Edition with 28 cores, would come close to a nuclear reactor and does manage to replicate the heat output usually experienced during atmospheric reentry from orbit.

    I doubt it comes close to the surface of the average star but certainly close to some of the cooler red dwarf stars and probably most brown dwarf stars.

  • a_wild_dandan 6 years ago

    (Intensity) = (Power) / (Unit Area)

    (Sun's Radius) = 695 x 10^6 m

    (Sun's Power) = 4 x 10^26 W

    (CPU Output) = 75 W

    (Die Size) = 37 mm x 37 mm

    (Sun Intensity) = (4 x 10^26 W) / (4 x pi x (695 x 10^6 m)^2) ~ 66 x 10^6 W/m^2

    (CPU Intensity) = (75 W) / (0.0014 m^2) ~ 53 x 10^3 W/m^2

    I'm getting a W/m^2 output several orders of magnitude higher for the sun. Perhaps I made an algebra mistake?
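    For what it's worth, running these exact numbers shows no algebra mistake; the sun's surface really does emit about three orders of magnitude more per unit area than this (unusually large) hypothetical die:

```python
import math

sun_power = 4e26                # W
sun_radius = 695e6              # m
cpu_power = 75.0                # W
die_area = 0.037 * 0.037        # m^2, the 37 mm x 37 mm die assumed above

sun_intensity = sun_power / (4 * math.pi * sun_radius**2)   # ~6.6e7 W/m^2
cpu_intensity = cpu_power / die_area                        # ~5.4e4 W/m^2
ratio = sun_intensity / cpu_intensity                       # ~1200x
```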

    • Tuna-Fish 6 years ago

      Your CPU is huge. For example, the highest-performance single-chip Pinnacle Ridge part (the AMD Ryzen 7 2700X) is 213 mm² and has a TDP of 105 W. That's roughly an order of magnitude difference.

    • namibj 6 years ago

      Yes. You can handle 100W on a 300mm^2 chip. Or 300W on a large one, compare AMD Ryzen/ Nvidia Volta (MXM/V100).

      • nine_k 6 years ago

        This gives 3.46e+5 W/m^2, probably comparable to the brown dwarf figures in the thread.

        • namibj 6 years ago

          Sounds higher than a nuclear reactor. For reactors you can find data showing that nucleate boiling without forced flow supports about 10 W/cm^2 without much risk of reaching critical heat flux. Past that point (in the strict sense), the surface transitions to an insulating vapor film: it rapidly heats beyond the temperature where nucleate boiling is sustained, and from what I remember/believe, it won't transition back until the surface temperature drops below the critical-heat-flux point, possibly even lower, because the vapor layer lacks agitation and uniformity.

          I did some research into what can be achieved with suitable dielectric fluids and nucleate boiling, to allow much higher heat flux at the silicon surface without requiring any energy input to actually separate hot and cold, while still allowing pumps and such to provide forced flow over the silicon. It seems that with a nozzle providing sufficient flow speed across the die, you can reach a higher critical heat flux than can reasonably be handled through the power pins of an AMD EPYC socket, which has limitations due to the LGA technology used for the contacts. With LGA, resistive heating at the contact between spring pin and pad weakens the spring and reduces contact pressure, which raises resistance further: a feedback loop that can cascade like dominoes to the other nearby power pins. A zero-insertion-force PGA socket should not be bound by this limit.

          I remember the critical heat flux being proportional to the flow rate, as well as to the distance the flow from the nozzle has to cross while providing cooling. The speeds were, IIRC, about 20 m/s to cool a delidded AMD EPYC at maximum power draw at 4 GHz, or rather, extrapolating from what can be achieved with a die surface temperature of 50 degrees Celsius.

          So, yeah, the heat flux is actually a problem. But I think one could integrate sensors and sufficiently fast switches that shut a section of the die off when its temperature reaches dangerous levels (fast enough to avoid damage, i.e. within microseconds or whatever time there is), and use such a test chip to engineer forced-flow, direct-die cooling. (Maybe even with a heat spreader on top, but that costs you: nucleate boiling doesn't get below about a 20 kelvin temperature loss, and system losses in the radiator, heat exchanger, and pipes likely add about 5 more kelvin.) This could open the door to operating CPU dies at much, much higher power densities than is currently normal. It's like a heat pipe on steroids. One would want circuitry in the CPU to shut down rapidly whenever the temperature rises quickly, because regardless of the cause, not doing so will literally blow a hole in the chip before you can drain the inductors providing smooth power to the socket.
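          For scale, the classic Zuber correlation estimates the pool-boiling critical heat flux for water at atmospheric pressure, which is roughly the ceiling the discussion above is working against before adding forced flow (standard saturated-water property values; dielectric coolants have considerably lower CHF, which is why forced flow and nozzles matter):

```python
# Zuber critical-heat-flux correlation for pool boiling:
#   q_chf = 0.131 * h_fg * sqrt(rho_v) * (sigma * g * (rho_l - rho_v))**0.25
h_fg = 2.257e6     # latent heat of vaporization of water, J/kg
rho_l = 958.0      # saturated liquid density, kg/m^3
rho_v = 0.598      # saturated vapor density, kg/m^3
sigma = 0.0589     # surface tension, N/m
g = 9.81           # gravitational acceleration, m/s^2

q_chf = 0.131 * h_fg * rho_v**0.5 * (sigma * g * (rho_l - rho_v))**0.25
q_chf_w_per_cm2 = q_chf / 1e4      # ~110 W/cm^2 for water at 1 atm
```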

  • dnautics 6 years ago

    keep in mind that 'the surface of a star', for various definitions of 'surface' is actually "cooler than you might think", because it's rather diffuse and more than made up for in total output by the sheer size of the star.

  • Zanni 6 years ago

    Temperature inside a nuclear reactor: 300 degrees Celsius. [1]

    Temperature on the surface of the sun: 5,600 degrees Celsius. [2]

    Temperature of a CPU: ~75 degrees Celsius. [3]

    [1] http://academic.brooklyn.cuny.edu/physics/sobel/Nucphys/pile...

    [2] http://coolcosmos.ipac.caltech.edu/ask/7-How-hot-is-the-Sun-

    [3] https://www.computerhope.com/issues/ch000687.htm

    • readams 6 years ago

      not sure why you're being downvoted, but it's worth mentioning that the temperature is not what matters here but rather the heat generation per area. The temperature is a function of how quickly the heat can be pulled away as well as the heat generation.

      • Zanni 6 years ago

        Thanks for the clarification.

    • earenndil 6 years ago

      > per unit area

      A CPU is a lot smaller than both a nuclear reactor and the sun.

JudasGoat 6 years ago

The author states "For every watt of power the CPU consumes, it must dissipate a watt of heat." It was always my understanding that in electrical devices (with the exception of heaters), the amount of heat produced is inversely proportional to the efficiency of the device. So is it really true that all energy provided to the CPU or SoC is dissipated as heat?

  • Elrac 6 years ago

    The author's sentence is essentially a truism, because fiddling with information _as such_ and in theory doesn't use up any energy. As practically implemented in current CPU electronics, though, information needs to be communicated from point A to point B as a change in voltage. To convey that voltage change means having to move some amount of electrical charge into or out of the tiny capacitor that is a transistor's gate, via the non-perfect conductor that is the doped-silicon trace between the two points. Resistance saps part of the energy moving those electrons around and converts it to heat.
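    That charge-shuffling cost is the usual CMOS dynamic-power term: each switching event dissipates roughly C·V^2 through the resistive paths, so total power scales as activity × capacitance × voltage squared × frequency. A toy estimate with illustrative (not measured) numbers:

```python
# Dynamic power of CMOS switching: P ~ alpha * C * V^2 * f
alpha = 0.1        # activity factor: fraction of the capacitance switched per cycle (assumed)
C = 3e-7           # total switchable capacitance, farads (~300 nF aggregate, assumed)
V = 1.0            # supply voltage, volts
f = 3e9            # clock frequency, Hz

P_dynamic = alpha * C * V**2 * f   # watts, all of it ending up as heat
```

With these made-up but plausible values the result lands around 90 W, i.e. squarely in desktop-CPU territory, which is why the "every watt in is a watt of heat out" framing holds in practice.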

    • dnautics 6 years ago

      > fiddling with information _as such_ and in theory doesn't use up any energy.

      that's not true at all. Any time you destroy information (for example, an AND gate can destroy information) you use energy; this is Landauer's principle.

      • Elrac 6 years ago

        Chances are you know more about this than I do, and I'd be interested to learn more.

        If you're willing to explain, I wonder how an AND gate destroys information. Certainly the output of a 2-input AND gate carries less information than both inputs together. But unless the gate is destroying the input signals, which it isn't, I don't see how information is being destroyed.

  • jamesough 6 years ago

    Yep, it all ends up as heat. Energy is conserved, so if it didn't end up as heat, it would have to go somewhere, and there isn't really anywhere else it could go.

    Related topic: reversible computing. You might wonder whether a computation need take any energy at all: it turns out that an irreversible computation (e.g. NOR, XOR, AND, etc.) is physically guaranteed to waste energy, but if you make sure your compute steps are always reversible (i.e. each output corresponds to one and only one input) then you can theoretically compute for free (the Feynman Lectures on Computation cover this well).

    But these energies are negligible compared to the wattage that goes through a standard CPU.
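    To put numbers on "negligible": the Landauer limit for erasing one bit is kT·ln 2, which at room temperature sits more than eleven orders of magnitude below a rough per-instruction energy for a real CPU (the CPU figures below are illustrative assumptions, not measurements):

```python
import math

k_B = 1.380649e-23        # Boltzmann constant, J/K
T = 300.0                 # room temperature, K

landauer_j_per_bit = k_B * T * math.log(2)    # ~2.87e-21 J per erased bit

# Rough real-world comparison: a ~10 W CPU retiring ~1e10 instructions/s (assumed)
cpu_j_per_instruction = 10.0 / 1e10           # ~1e-9 J per instruction
gap = cpu_j_per_instruction / landauer_j_per_bit   # how far we are from the limit
```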

  • cgdcraig 6 years ago

    That's true only of devices that don't convert the energy into work (motors), light (LEDs), or other voltages (transformers). All other electronics convert all of their power into heat. Basically, any computer is a heater that just so happens to be able to do math as well. Energy is consumed in doing a computation, but it's 6-7 orders of magnitude smaller than the heat produced.

blackflame7000 6 years ago

Why don't they start making CPUs 3-dimensional, like a cube with 6 "processors" (each with multiple cores) as its sides, with the pins on the opposite side of the cube wall? Seems to me more internal volume might allow for more clever heat distribution channels.

  • deepnotderp 6 years ago

    1. Heat dissipation is now ~n x m times worse, where n is your transistor layer count and m is the increased thermal resistance.*

    2. Power delivery is now ~n times worse, where n is your transistor layer count.*

    3. Connections between chips are very slow, power-hungry, and expensive. Fabrication of "monolithic" 3D is painful temperature-wise and usually results in crummier transistors.

    With that being said, innovative 3D integration methods in specific applications can help a lot. Shameless plug: we at Vathys do this for deep learning chips.

    * to a first order of approximation

    • tormeh 6 years ago

      The heat can be tackled in part by pumping water through holes in the CPU. I believe it was IBM that came up with this. Can't tell if it's feasible or not.

      • deepnotderp 6 years ago

        Yes but microfluidics limits how thin each die can be.

        • agumonkey 6 years ago

          What about stacked heat vias ?

          • deepnotderp 6 years ago

            What are those? Do you mean something like thermal "dummy" vias?

            • agumonkey 6 years ago

              yeah, something imprinted in all layers so you could evacuate heat

      • blackflame7000 6 years ago

        I was thinking by either creating a temperature differential on a copper conductor to chill the cube from within or that the motherboard/walls would provide cooling from the pin side.

    • deaddodo 6 years ago

      They do use some 3D techniques inside dies. Tri-gate transistors and 3D ICs, for instance.

      • deepnotderp 6 years ago

        Yes, well, FinFETs and other non-planar transistors are very different from what's being discussed here. Interestingly, FinFETs actually do suffer from a bit of a self-heating effect, although this usually isn't a problem for AC operation.

  • neilmovva 6 years ago

    That still means multiple silicon dies, which we have known how to do for a while (see: Intel Core 2 Quad from 2006, and more recently AMD Epyc).

    Having more dies lets you dissipate more heat, but then it's kinda hard to build low-latency / high-bandwidth interconnects between the dies. Inter-die buses go over a PCB or interposer, which impose higher parasitic capacitance and make it difficult/expensive to run wide interfaces. That's why techniques like "dark silicon" allocation are important - it allows us to get more perf in a single die.

  • gameswithgo 6 years ago

    less surface area per transistor makes the heat problem worse. they are already a little bit 3d though.

    • blackflame7000 6 years ago

      What if the cooling came from the pin side of the chip? Then the surface area is the same

  • asfasgasg 6 years ago

    If I understand correctly, neither heat dissipation nor existing manufacturing techniques are amenable to this approach. Also modern CPUs do have more than a dozen layers IIRC.

    • woah 6 years ago

      I think op is talking about 6 flat normal chips as the sides. This would allow for cooling stuff inside the cube. Maybe having only 5 of the sides as chips would make it even easier to have a heat sink. The center of the cube could be copper or something.

      • jacquesm 6 years ago

        You can't cool from the inside of that cube without having some way of transporting the heat out of it.

        All you'd end up doing is heating that inside up to the temperature of the dies and after that there would be no more cooling effect (and this would happen in a few seconds after starting the whole thing up). You could do an 'inverse' of this by cooling the dies from the outside and having the interconnects in the space in between. This would still require a lot of cooling and there would be an issue with connecting the resulting assembly to the underlying PCB.

        • jonhendry18 6 years ago

          I think woah means that the top would be open. So instead of a cube it'd be an open box. A heat sink (or arrangement of peltier coolers attached to a fan and heat sink, or whatever) would fit down into the box and be in contact with the five core-containing sides.

        • blackflame7000 6 years ago

          What if you used electronic cooling to chill a copper thermal conductor?

          • jacquesm 6 years ago

            Electronic cooling? Do you mean Peltier elements? They have a 'hot' and a 'cold' side, and their capacity is handily outstripped by any modern CPU. If that worked we would be using Peltier coolers to make quiet PCs today, after all, that problem is only 1/5th as complex.

      • rwmj 6 years ago

        Another problem is that companies (though not necessarily users) love thinner and thinner phones and laptops, and a cubic CPU wouldn't fit in such a device.

  • wolfgke 6 years ago

    > Why don't they start making CPUs 3 Dimensional like a cube with 6 "processors" each with multiple cores as its "sides" with the pins on the opposite sides of the cube wall.

    Not directly an answer to this specific question, but a colleague (of sorts) who is writing his PhD thesis on 3-dimensional chip design gave a popular-science lecture about this topic. As I understood it, the central problem is that it is very hard to produce chips with multiple (lots of) layers where you want interconnects in between them. In particular, producing the interconnects between the layers is really hard if they can lie "everywhere" instead of only at the border. There are multiple ideas for how this might be done (e.g. drill holes into the substrate with high-precision lasers and try to fill them with something conductive), but none of them "really works" as of today (at least if we are talking about chips with more than about 4 layers and interconnects everywhere in between).

    • deepnotderp 6 years ago

      Fun fact: the through silicon via was invented by William Shockley himself

  • wongarsu 6 years ago

    For heat dissipation you want the most surface area per volume (because you can only transfer heat away through the surface). The optimal arrangement for that would be a huge, flat, one-atom-thick sheet.

    Another goal is low latency (and a high clock rate, which depends on low latency), which suggests packing everything into a cube or even a sphere. So we compromise somewhere in the middle with a flat square die that has a few layers.

    • perl4ever 6 years ago

      "For heat dissipation you want the most surface area per volume (because you can only transfer heat away in the surface area). The optimal arrangement for that would be a huge, flat, one atom thick surface."

      Made me think of this:

      https://en.wikipedia.org/wiki/Menger_sponge

      Say we have roughly 300 sq mm, that's about 17.32 mm square, which is 8.24e+7 silicon atoms (0.21 nm) across.

      Then we have a surface area of roughly 1.36e+16 atoms - the area x 2.

      If we make a fractal sponge down to the limit of single atoms, that's about 16.5 iterations of removing cubes from a 17.32 mm cube. Let's ignore the difficulty of doing half an iteration. According to the formulas from Wikipedia, the result has a volume of about 0.7% of the original, with 2.4e28 "sides" of atoms exposed.

      So the third dimension gets you about 1.8 million times the surface area. I suppose this isn't nearly as good as 4e10 flat sheets with 1 atom separation between each, but you could argue it's more practical because everything is connected...
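      These napkin numbers can be sanity-checked with the standard Menger-sponge formulas (for a unit cube, volume (20/27)^n and surface area 2(20/9)^n + 4(8/9)^n after n iterations), keeping the same atom-size assumption as above:

```python
import math

atom = 0.21e-9                 # assumed silicon atom diameter, m
side = 17.32e-3                # side of the square/cube, m (sqrt of ~300 mm^2)
atoms_across = side / atom     # ~8.2e7 atoms per edge

# Iterations until the smallest removed holes reach atomic scale, ~16.5
n = math.log(atoms_across, 3)

volume_fraction = (20 / 27) ** n                        # ~0.7% of the cube remains
area_multiplier = 2 * (20 / 9) ** n + 4 * (8 / 9) ** n  # in units of one cube face
```

The volume fraction matches the ~0.7% above; the surface-area gain relative to a flat slab is area_multiplier divided by 2 (the slab's two faces).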

      • perl4ever 6 years ago

        Re "I suppose this isn't nearly as good as 4e10 flat sheets with 1 atom separation between each" - I guess it should be better, actually, now that I happened to notice 10+16 < 28. I got confused about whether my reference point was the basic cube or the flat sheet.

    • throwwit 6 years ago

      I think a dimpled fabrication would be best... with whatever amounts to heat sinks on both sides.

  • deepnotderp 6 years ago

    Such a "side cube" configuration would probably lead to longer wires than is useful.

  • burnte 6 years ago

    Just the reverse. As volume increases, the relative amount of surface area decreases.

bogomipz 6 years ago

I had a couple of questions about this bit of history mentioned in the article. I'm hoping someone could shed some light on this:

>"You can emulate floating-point arithmetic by using integer instructions—but taking 10–100 times as long."

Exactly how is/was floating point arithmetic emulated using only integers?

Why does that range span an order of magnitude? Is this dependent on the precision, I'm guessing?

  • brandmeyer 6 years ago

    If you have a copy of Knuth's Art of Computer Programming, the section on seminumerical algorithms covers software floating point.

    Another source is the Handbook of Floating Point Arithmetic, which covers both hardware and software methods in detail.

    An implementation in source code may be found in libgcc. IIRC their ARM assembly code versions are some of the fastest routines available anywhere for software floating point emulation. While they are in assembly, ARM assembly is fairly intelligible.

    Finally, you can probably come up with 90% or more of the correct solutions yourself if you start with the definition of a floating-point number as (-1)^s * 1.m * 2^(e - bias) and grind through the algebra with the bias as a constant. The theorems that prove the minimum number and nature of the guard bits to get correct rounding are another matter, but you can start off by computing the equivalent infinite-precision terms and then rounding.

  • phkahler 6 years ago

    We had floating point when programming in BASIC on old 6502 8-bit computers. There were software routines for doing the math. You know, multiply the mantissas, add the exponents... If someone gave you pointers to a couple of 4-byte chunks of data and told you to write code to do a floating-point multiply on the contents using only C char variables, what would you write? That's why it's 100 times slower than a nice modern fmul. It wasn't quite that bad; they could use the carry flag, which isn't available in C.
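    As a concrete sketch of what such a routine does, here is a minimal single-precision multiply in Python using only integer operations. It handles normal, finite inputs only: no zeros, subnormals, infinities, NaN, or exponent overflow, and the rounding is a simplified round-half-up rather than IEEE round-to-nearest-even:

```python
import struct

def float_bits(x):
    """Reinterpret a float as its 32-bit IEEE 754 bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_float(b):
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def soft_fmul(a, b):
    """Multiply two normal, finite floats using only integer operations."""
    ab, bb = float_bits(a), float_bits(b)
    sign = (ab >> 31) ^ (bb >> 31)
    # Add the exponents, removing one copy of the bias (127)
    exp = ((ab >> 23) & 0xFF) + ((bb >> 23) & 0xFF) - 127
    # Restore the implicit leading 1 to get 24-bit significands
    ma = (ab & 0x7FFFFF) | 0x800000
    mb = (bb & 0x7FFFFF) | 0x800000
    prod = ma * mb                      # 48-bit product of the significands
    # Normalize: [1,2) x [1,2) lies in [1,4); bit 47 set means product >= 2
    if prod & (1 << 47):
        exp += 1
        shift = 24
    else:
        shift = 23
    mant = prod >> shift
    # Simplified rounding: round half up on the dropped bits
    if (prod & ((1 << shift) - 1)) >= (1 << (shift - 1)):
        mant += 1
        if mant & (1 << 24):            # rounding overflowed the significand
            mant >>= 1
            exp += 1
    return bits_float((sign << 31) | (exp << 23) | (mant & 0x7FFFFF))
```

A production routine such as libgcc's __mulsf3 handles all the special cases as well, which is part of why software emulation costs so much more than a hardware fmul.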

    • bogomipz 6 years ago

      Thanks these are all good reads and explanations. I guess I just have never had to think about FPU emulation before and it kind of threw me for a loop. It's amazing what we can take for granted now I guess :)

      Cheers.

  • pkaye 6 years ago

    A floating-point number consists of 3 parts: mantissa, exponent, and sign. These are usually packed into a 32-bit or 64-bit value. So you can split them back into 3 integers, do the integer adds, multiplies, and shifts to get the result, and pack them back. It is not hard to implement a basic version, but making it fast and compliant with the standard is a bit more work.

bogomipz 6 years ago

I had a question - the author states:

>"The most obvious is the instruction decoder, which is near the start of the pipeline, and is responsible (in the loosest possible terms) for passing the inputs to each of the execution units."

Why would it "in the loosest possible terms"? Isn't this "precisely" the job of the decoder?

  • kabdib 6 years ago

    IIRC in many chips the decoder stage translates native instructions into micro-ops, which are RISC-like and the main food for execution units. The translation is not necessarily a simple one (one native instruction is often more than one micro-op, and it's possible to collapse multiple native instructions -- especially stuff like prefixes -- into one or more micro-ops).

    • bogomipz 6 years ago

      Ah OK, that makes sense. This is likely what the author means here by "loosely." Cheers.

jarym 6 years ago

We just need a clueless CIO to turn up and ask if 'Cloud' will solve the problem :D

  • stephengillie 6 years ago

    Cloud is busy facing off against Sephiroth. What's Batman up to?

  • melbourne_mat 6 years ago

    I love the cynicism :-)

    • jarym 6 years ago

      hehe - not everyone got it from the downvotes I got!