291 points by ccwilson10 2 months ago
With AMD closing the gap with Intel, competition in the CPU market is back after a ten-year absence.
What I'm waiting for is an AMD GPU that can compete with a top-tier NVidia offering. Vega is nice, but not really a contender on the mid to top end. The G series cpus with vega inside are great, but where is the 2080ti, or even 1080ti, killer? Even something a bit slower, but close, would be great.
Is it too much to ask of AMD to handle both? I am unsure, but I would love to see NVidia in a price and performance war at the same time Intel is. Competition makes the resulting products better. Do you think the 9th Gen Intel chips would be octocore without Ryzen?
Yeah, despite owning a 2080Ti and some Teslas, I'd strongly prefer AMD to be competitive in the deep learning space.
I've heard that tensorflow-rocm can now be installed with a simple:
`pip install tensorflow-rocm`
Has anyone done Vega64/56/RX580 benchmarks with this? How fast is it in comparison to a 1080Ti/2080Ti/Titan V? Can it run the latest state-of-the-art models? Thanks!
Lambda labs has done some benchmarks.
> MIOpen is a step in this direction but still causes the VEGA 64 + MIOpen to be 60% of the performance of a 1080 Ti + CuDNN based on benchmarks we've conducted internally at Lambda. Let that soak in for a second: the VEGA 64 (15TFLOPS theoretical peak) is 0.6x of a 1080 Ti (11.3TFLOPS theoretical peak). MIOpen is very far behind CuDNN.
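For what it's worth, the arithmetic behind that quote can be spelled out. This is only a back-of-the-envelope sketch in Python: the three input numbers come from the quoted comment, while the "relative utilization" metric and the function name are my own illustrative assumptions, not something Lambda reported.

```python
# Inputs taken from the quoted Lambda comment.
VEGA64_PEAK_TFLOPS = 15.0
GTX1080TI_PEAK_TFLOPS = 11.3
VEGA64_RELATIVE_SPEED = 0.6  # Vega 64 + MIOpen throughput as a fraction of 1080 Ti + CuDNN


def relative_utilization(rel_speed: float, peak_a: float, peak_b: float) -> float:
    """How well card A turns theoretical peak into delivered throughput,
    relative to card B (1.0 would mean equal utilization).
    Illustrative metric only, not a published benchmark figure."""
    return rel_speed * peak_b / peak_a


utilization = relative_utilization(
    VEGA64_RELATIVE_SPEED, VEGA64_PEAK_TFLOPS, GTX1080TI_PEAK_TFLOPS
)
print(f"Vega 64 utilization relative to 1080 Ti: {utilization:.2f}")
```

In other words, under these numbers the Vega 64 stack delivers a bit less than half as much per theoretical TFLOP as the 1080 Ti stack, which is consistent with the quote's point that the software is what lags.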
It would be great to do deep learning with AMD GPUs. It might do wonders for prices too, as the new Nvidia cards are really expensive nowadays.
Of course, crypto might have something to do with this. I see miners offloading GTX 1060-s and 1070-s so perhaps the second hand market might become more approachable?
I wouldn't say AMD is closing the gap in CPUs. They're ahead of Intel and are widening that lead.
For whatever reason, Nvidia hasn't stagnated the way Intel has. Nvidia has continued rapid progress in GPUs even though they have a dominant position (and a near monopoly in the datacenter).
Perhaps because they don't have as much of a lead as Intel used to? Intel was a monopoly, plain and simple, but AMD has always been a contender when it comes to GPUs.
The market made it pretty clear that as long as Nvidia was increasing performance substantially with each release, they could collect a healthy margin of profit. There was no incentive to artificially slow down their progress. (Additionally, the benefits of GPU power were clear to gamers, and new benefits from parallel processing units became apparent in the areas of cryptocurrency and machine learning.)
Intel, however, ran into engineering issues with their process improvements, while having limited market-driven incentive to spend on improving efficiency (IPC) beyond a certain level (or on increasing core count). So their "tick-tock" cycle was broken.
Nvidia came to my place of work to do a presentation on the stuff they were working on around AI and deep learning. More than anything, I took away the message that regardless of how large your budget is, they've got a way for you to spend all of it on GPU compute.
AMD is still behind per core in both clocks and per clock performance. Intel has demonstrated the ability to keep adding cores to match AMD's largest offerings by just scaling up architectures they have been milking cheaply for high margins for years.
TSMC's 7nm node will be the first legitimate tech advantage over Intel, and if Intel does get their act in gear and manages to deploy their 10nm next year they will at least keep parity.
I am a huge AMD fan and will almost certainly be considering a Zen 2 build next year (though the fact that I can disable Intel ME and can't remove AMD PSP will temper my interest), but that's more to support the underdog and get better value for money than to get the absolute best possible performance.
Intel has been the undisputed champion of process since the 80486 days so it's surprising to see how badly they're scrambling to hit even 10nm, let alone 7nm.
I would never have bet money that the tiny little CPU designed by Acorn Computers that ended up powering the Newton would be the first CPU to jump two nodes ahead of Intel in terms of process, but here we are.
ARM's doing great work and I hope they continue to push core counts to even more ridiculous levels.
(I think you've confused ARM and AMD, FWIW!)
it's surprising to see how badly they're scrambling to hit even 10nm, little lone 7nm
The feature size there is a bit misleading – "Intel 10nm" and "TSMC 7nm" are roughly equivalent, with IIRC the Intel 10nm process actually having a higher transistor density. The TSMC chips aren't really "two nodes ahead" – but it is likely that Intel will lose its process lead for maybe a year or so.
I get the impression that Intel overextended on their 10nm process, in that they were perhaps a bit more ambitious than other manufacturers and it came back to bite them when there were scaling problems. On the other hand, last I heard, the scaling problems experienced with the 10nm node haven't held up Intel's 7nm node, which could well see them re-establish their process lead.
At the end of the day, it's great that the market is seeing some more competition, so hopefully we will all be able to enjoy the benefits from a variety of manufacturers soon!
> I get the impression that Intel overextended on their 10nm process, in that they were perhaps a bit more ambitious than other manufacturers and it came back to bite them when there were scaling problems.
I’m no microelectronics expert, but I wonder if we are hitting limits in clock speed scaling with regard to feature size, i.e. when shrinking past a certain feature size, clock speeds actually have to drop for the chip to be stable.
Intel’s priority is clock speed first and foremost due to what they produce - desktop and server CPUs. A new process is pointless for them if they can’t get at least equal clock speeds out of it as their old process.
TSMC caters to mobile CPU and GPU production. Those will never boost to 5GHz like desktop CPUs: the former for power efficiency (and heat) reasons, and the latter because GPUs tend to go for more “cores” as they focus on parallelizable workloads.
As I understand it, it's not chip speed. It's chip voltage. Everything is a conductor if the voltage is high enough, and the closer the traces get, the less resistance the insulation provides. The problem is that at the temperatures we run computers at, the conductor traces need a fair bit of voltage to push the current through the entire chip.
The ratio of conductivities of insulators and conductors stays the same.
It's more that making many very long conductors with very short insulators between them becomes problematic. That was the case at any process size, but now we are pushing the limits as far as possible to try to make bigger chips.
Not using silicon they won't but other materials might make that possible. I doubt TSMC will sit on the sidelines as those become more mainstream.
I don't doubt that Intel will get it together and remain competitive but right now they're really in a bad place. They're usually a step or two ahead, even when pushing ridiculous designs with no merit at all like the Pentium 4 or Itanium. To see them scrambling now to catch up is pretty much unprecedented.
> I think you've confused ARM and AMD, FWIW!
Quite possibly they are, but to be fair ARM processors were the first to launch on 7nm.
I'm talking about how ARM got there first on the process and now AMD has a chance to fab using that as well. AMD got out of the fabrication game because they couldn't keep up, which means they can use specialists like TSMC, which are killing it now.
Intel's largely "secret sauce" process has been their greatest asset. Now it looks like a huge liability.
You know that Intel 10 is roughly the same, if not better than TSMC 7, right?
When Intel ships a consumer part at 10nm we'll talk. Until then TSMC is ahead because they're shipping.
The current generation i9 processors are all 14nm.
It's 'let alone'
> AMD is still behind per core in both clocks and per clock performance.
Yes, that's true, but the per clock performance is close. They are only 5-10% behind in single threaded tasks without AVX (depending on the workload). The IPC increase for Zen 2 is expected to be 10-15% (it will of course depend on the workload). And their Achilles heel, the AVX performance, will also improve with Zen 2 (256 bit instead of 128 bit etc.)
Due to 7nm the clocks (for consumer hardware like Ryzen and Threadripper) will probably also increase (not 5 to 5.2 GHz after overclocking like Intel, but overclocks up to 4.7 GHz could be possible, seeing that 4.3 GHz is possible on the current node, which is mobile optimized).
Depending on how much the clocks increase I believe they can close the gap. Maybe even pass Intel. The future certainly looks promising for AMD.
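A rough way to sanity-check this is to treat single-thread performance as IPC times clock. The Python sketch below plugs in the ranges mentioned above; how they are paired into a "pessimistic" and an "optimistic" case is my own assumption, and so is reading the 5-10% deficit as a pure IPC gap.

```python
def relative_single_thread(ipc_ratio: float, clock_ratio: float) -> float:
    """Single-thread performance of chip A relative to chip B,
    modeled simply as (IPC ratio) * (clock ratio)."""
    return ipc_ratio * clock_ratio


# Ranges taken from the comment; the pairing into two cases is an assumption.
scenarios = {
    "pessimistic": dict(ipc_deficit=0.10, ipc_gain=0.10, amd_ghz=4.7, intel_ghz=5.2),
    "optimistic": dict(ipc_deficit=0.05, ipc_gain=0.15, amd_ghz=4.7, intel_ghz=5.0),
}

results = {}
for name, s in scenarios.items():
    # Current IPC relative to Intel, then apply the expected Zen 2 uplift.
    ipc_ratio = (1 - s["ipc_deficit"]) * (1 + s["ipc_gain"])
    clock_ratio = s["amd_ghz"] / s["intel_ghz"]
    results[name] = relative_single_thread(ipc_ratio, clock_ratio)
    print(f"{name}: Zen 2 at {results[name]:.2f}x Intel single-thread")
```

Under these assumptions the result lands somewhere between roughly 0.89x and 1.03x, so whether the gap actually closes depends mostly on how far the clocks go.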
I wonder if they will revive their X APUs for the Server. In the past they had Opteron X APUs to increase the compute density of servers. Now with Zen and Vega this could be a nice combo in addition to discrete GPUs.
For example this system: https://i.imgur.com/yt5FasA.jpg?1
Imagine replacing these two 32 core Epyc CPUs made in 14nm with two 32 core Epyc APUs (Zen 2) made in 7nm, which would use the saved space due to 7nm for Compute Units, and you might get an additional 10-16 TFlops per System. Which is basically one additional GPU.
People keep thinking AMD's 7nm is this amazing thing, like everyone is talking about the same thing when it comes to CPU's and nm. When in reality nm has just been marketing fluff for a decade now, just like response times in monitors.
AMD's 7nm might get them close to on par with Intel's current 14++ nm chips, but it's not like AMD has really figured out how to make the entire CPU half the size.
People who follow the tech press talk about the same thing when talking about nm, and they know TSMC 7nm competes with Intel 10nm not Intel 14nm.
It's been all over town for months that TSMC's 7nm is estimated to be worse than Intel's ambitious failure that is 10nm but quite a bit better than Intel 14nm (with the exception of clocks), and that 7nm+ with EUV for cost savings (which TSMC already taped out last month) is estimated to be equal or even slightly better.
So I'm not really sure what to make of your comment?
> AMD's 7nm might get them close to on par with Intel's current 14++ nm chips, but it's not like AMD has really figured out how to make the entire CPU half the size.
No they didn't, but they don't claim that, do they? From what they say they decided on the IO die exactly because IO doesn't scale as much, and that decision allowed them to double the number of cores. Since 7nm is expected to be much more expensive than previous nodes this seems really clever from a money standpoint as well. The core only Zen 2 chiplets are expected to be around 70 mm² which is mobile SoC territory.
AMD is already close in IPC to Intel even though AMD uses a worse node (GloFo's mobile optimized 14nm is more like Intel 22nm than Intel 14nm) and wins in multithreaded workloads because their SMT implementation seems to scale better than Intel's. They also seem to have better performance/watt under load. I have not seen numbers for idle wattage for Xeons, but Intel's desktop CPUs are slightly (5 - 10 watt) better when idle.
So I'm looking forward to them having the better node for the first time ever.
> TSMCs 7nm node will be the first legitimate tech advantage over Intel
AMD's Zen architecture is leaps and bounds above Intel's offering, not only performance-wise but also where it matters the most: product design and production costs.
>For whatever reason, Nvidia hasn't stagnated the way Intel has.
Nvidia doesn't fab their own chips and never had a process lead.
People underestimate just how huge a deal Intel's traditional lead in fabrication tech was. I've long argued that the real casualty of Intel's anticompetitive tactics in the early 2000s was AMD being forced to spin off GloFo. Far more than AMD's near term marketshare at the time, it led to a situation where AMD couldn't really even fall back to their traditional position of competing on price at the low end of the market, and it played a direct hand in Intel's decade-plus domination of the market.
>Nvidia doesn't fab their own chips and never had a process lead.
I would argue that has nothing to do with Intel's stagnation. Look at Intel's leadership and management. Look at Jensen Huang. The last time Intel had any energy at the management level was under Pat Gelsinger, and they pushed him out.
Nvidia hasn't? Looks like the big PR push for the GTX 20X0 has turned into double the price and 15% increase in performance on the average game.
Well yeah, specifically the gaming sector seems to be driven by huge margins for tiny gains. The volume production will go to industry.
>I wouldn't say AMD is closing the gap in CPUs. They're ahead of Intel and are widening that lead.
Intel is still king in high-end gaming performance, albeit not as much as Nvidia is ahead of AMD in GPUs.
When I built my PC it seemed like Intel had a pretty small single core edge but AMD had twice the cores per dollar. Easy choice if you do anything but gaming imo. I really hope they catch up in the GPU space soon.
Interestingly enough, a lot of productivity PCs still go for Intel in the form of the 18-core i9-7980XE. AMD isn’t necessarily a shoo-in in productivity either.
“A lot of productivity PC” builds are going for $1700 Intel CPUs... really? Even if the $1700-CPU PC market was really popular, tell me: why would they choose the slower of the $1700 CPUs, other than corrupt or bureaucratic business practices?
And don’t try to argue how Intel’s $1700 18-core CPU is faster than AMD’s similarly priced 32-core CPU because Intel’s has slightly faster per-core performance. Such an argument would be absolutely absurd: the point of an 18-32 core CPU is NOT the single threaded performance :)
This guy uses it: https://youtu.be/jweQNDCe218
Don’t know about you but his builders actually benchmark the CPUs.
Dansgaming uses the 18 core i9 for his streaming box.
Not every application a person uses scales to high core counts. In such situations a CPU with good single thread performance (in addition to high core counts) would be beneficial.
Cache performance matters too. AMD’s CPUs have a split L3 cache. Some applications might not like that.
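To put rough numbers on the core-count versus per-core-speed argument, here is a minimal Amdahl's law sketch in Python. The two "chips" are hypothetical: 18 cores with a 10% per-core edge against 32 cores at baseline speed. The `per_core_speed` values are illustrative assumptions, not benchmark results for any real CPU, and the model ignores cache and memory effects entirely.

```python
def amdahl_throughput(cores: int, parallel_fraction: float,
                      per_core_speed: float = 1.0) -> float:
    """Speedup over one baseline core under Amdahl's law,
    scaled by a per-core speed factor."""
    serial_fraction = 1.0 - parallel_fraction
    return per_core_speed / (serial_fraction + parallel_fraction / cores)


# Hypothetical chips: 18 faster cores vs 32 baseline cores.
for p in (0.50, 0.90, 0.99):
    fewer_fast = amdahl_throughput(18, p, per_core_speed=1.10)
    more_slow = amdahl_throughput(32, p, per_core_speed=1.00)
    winner = "18 fast cores" if fewer_fast > more_slow else "32 slower cores"
    print(f"parallel fraction {p:.2f}: {fewer_fast:.1f} vs {more_slow:.1f} -> {winner}")
```

With these made-up inputs the faster cores win when only half the work parallelizes, and the higher core count wins once the parallel fraction is around 90% or more, which is roughly the disagreement in this subthread.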
> Not every application a person uses scales to high core counts
But do you know what really scales to high core counts? Running more than one application at a time.
> Running more than one application at a time.
Depends on how much memory bandwidth and how much CPU cache your applications are using.
You wouldn’t want to thrash the L3 cache and make the situation worse with a saturated memory bus.
> Intel is still king in high-end gaming performance
Not really if you consider the price per core. Normally designed games utilize all cores, and something like Ryzen 7 2700X provides a major benefit. Comparable Intel CPUs are a lot more expensive. Their only advantage is higher overclock frequency. But if you need to overclock your CPU to play something, that game is already poorly designed and is probably not using all cores properly.
> Normally designed games utilize all cores, and something like Ryzen 7 2700X provides a major benefit.
No they don’t. Most games barely scale to use 4 cores - some struggle to use even 2.
Even games with tons of threads tend to have one ultra-heavy thread that becomes the limiting factor, i.e. you need single thread performance.
It’s moot regardless since Intel’s i9-9900K has 8 cores and 16 threads too.
> Most games barely scale to use 4 cores
It means they are poorly designed, which is exactly my point. It's not really a measure of CPU quality, but rather a measure of those games' quality. Normal games today use something like Vulkan to saturate the GPU and should not be CPU bound.
So if you need a single thread performance that requires overclocking, it's a poor engine design.
> It’s moot regardless since Intel’s i9-9900K has 8 cores and 16 threads too
And costs a lot more. That's why I mentioned price per core above. I'd use such a price difference to get a better GPU instead.
Unfortunately, time is at a premium in game development, and the amount of crunch is already absurd.
Multithreading hasn’t gotten any easier.
Even when games are forced to multithread, as on consoles, the same games running on PCs with half the cores (admittedly at nearly twice the clock speed and with higher IPC) outpace consoles with 2x the frame rates.
> And costs a lot more. That's why I mentioned price per core above. I'd use such a price difference to get a better GPU instead.
Of course, it’s the best on the market. Intel would be stupid not to charge a premium. It’s how such things are priced.
> Unfortunately, time is at a premium in game development, and the amount of crunch is already absurd.
Basic software architecture is not handled in crunch time.
A complex architecture will nonetheless take longer to implement and debug.
My point is, for gaming there is no need to spend so much money just to get higher single core frequency. There are some games that are very poorly optimized, but I see them as edge cases which you can skip if it becomes an issue. Most games don't require overclocking really.
That’s correct. I never buy the top of the line because I know it doesn’t have a good cost:benefit ratio.
BUT there are people that want the absolute best available and have the money to afford it ... /shrug
> There are some games that are very poorly optimized, but I see them as edge cases which you can skip if it becomes an issue.
There are a lot of games that aren’t well threaded.
Well-multithreaded games come primarily from rich AAA developers, and not even all of them do it; some just don’t have the programming talent for it, and some have games that have run for decades and are too old to multithread without rewriting the whole game.
PS: Sorry for late reply. Apparently people disagreed with me and I had negative Karma for a while. Which slows down posting?
Not sure how it works. It can delay you even without negative karma.
Most game source code I've seen has exhibited this "poorly designed" trait. Some because it was originally written in a single-threaded context and continued to provide shareholder value, and others because it didn't have high enough performance needs to utilize parallelism.
I think that will slowly change over time though, especially for big-budget titles that want to scale with performance better. Architectures like Unity's Job System and the specs package in Rust with a stronger emphasis on staged data processing can help with utilizing cores and cache.
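For anyone who hasn't seen the pattern, here is a toy Python sketch of the "staged data processing" idea behind ECS-style designs. It is not Unity's Job System API and not the Rust specs API; `World`, `spawn`, and `movement_system` are hypothetical names made up for illustration, and the two "jobs" run sequentially here rather than on real worker threads.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    """Toy ECS-style store: one flat list per component, indexed by entity id."""
    positions: list = field(default_factory=list)
    velocities: list = field(default_factory=list)

    def spawn(self, position: float, velocity: float) -> int:
        """Add an entity and return its id (its index in every component list)."""
        self.positions.append(position)
        self.velocities.append(velocity)
        return len(self.positions) - 1


def movement_system(world: World, dt: float, start: int, stop: int) -> None:
    """One processing stage: reads velocities and writes positions for
    entities in [start, stop). Disjoint ranges touch disjoint data, which
    is what would let a job scheduler hand them to separate cores."""
    for i in range(start, stop):
        world.positions[i] += world.velocities[i] * dt


world = World()
for n in range(4):
    world.spawn(position=0.0, velocity=float(n))

# Two "jobs" over disjoint halves of the entity range.
half = len(world.positions) // 2
movement_system(world, dt=0.5, start=0, stop=half)
movement_system(world, dt=0.5, start=half, stop=len(world.positions))
print(world.positions)
```

The design point is that components live in contiguous arrays and each system declares what it reads and writes, so splitting the work across cores (and keeping it cache friendly) becomes a scheduling problem rather than a rewrite.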
Interesting, I've never heard of ECS before.
> They're ahead of Intel and are widening that lead.
For a while in the early 2000s, AMD's CPUs were supposedly better than Intel's CPUs. There was a lot of doom and gloom predicted.
I briefly worked for Intel during this period. At an internal quarterly meeting, they shared some confidential information. It was very simple, and very damning to AMD. (In short, Intel very quickly came back on top for reasons that were very obvious to anyone paying attention.)
I'd love to be a fly on the wall at Intel right now. I wonder if they really are falling behind, or if they know some things that we don't?
I'm sorry, that's deeply revisionist history. Intel kept their dominance through illegal market practices, despite AMD's tech advantage at the time. Intel eventually paid out >$1B, but by then the damage was done and it would take AMD almost 10 years to come back.
> I'm sorry, that's deeply revisionist history. Intel kept their dominance through illegal market practices, despite AMD's tech advantage at the time.
While that may have been true at the time, it doesn't negate the point that Intel knew internally, long before others, that technically superior solutions were coming.
And, of course, that's exactly what happened.
Are you skipping over the part about illegal business practices that have nothing to do with the technology?
But illegal business practices only have impact on economics. They don't change the fact that Intel had a technologically superior solution in the pipeline that would eventually trump AMD's offerings.
That would have resulted in Intel taking the economic crown anyway, irrespective of their illegal business practices.
> But illegal business practices only have impact on economics.
That's patently false. As OP stated, the biggest impact of Intel's illegal business practices was getting rid of any competition for over a decade in spite of having a technically inferior and underperforming product line.
I think your comment would sit better if you indicated the information, since otherwise it just reads "he said, she said, and what they said was FUD."
Why does he need to say that?
We already have the benefit of being able to look back.
Isn't it totally obvious today that Intel had non-NetBurst successors in the pipeline that were superior to anything AMD had to offer?
I'd rather not repeat confidential information in a public forum, but the advantage had very little to do with a specific feature, technology, or performance.
Was that information "We have finally accepted that NetBurst is a failure and our next chip will be a more conventional design that should outperform AMD's chips"?
"Our team in Israel also has a laptop design that's based on the P3 core, which, we believe, will be great for the desktop as well."
The other part of this is the disagreement between the Radeon Technologies Group and AMD.
I can't find the other news articles, but a number of Radeon group employees have moved to Intel.
A lot of the tension I read about was that the Vega team was ransacked for console chips, i.e. the PlayStation 5 and the new Xbox. There was also speculation about a disagreement between Raja and Lisa.
I'm also waiting for a higher end AMD gpu, but will probably grab a 5xx series in the future to tide me over.
The basic principle of GPUs is that performance scales linearly with transistor count and die size. Since a GPU is nothing more than a massively parallel beast, the more you throw in the better.
You can't really expect a 400mm2 GPU to compete with an 800mm2 GPU. So unless AMD makes a monster-size GPU, they will never be able to compete directly with Nvidia.
So why doesn't AMD make one? Economies of scale. Nvidia can afford the huge cost of design, testing, and the (relatively) low yield of an 800mm2 chip, as long as they have customers buying it in bulk. Nvidia is basically enjoying all the deep learning / machine learning money buying into their CUDA ecosystem; they can afford to make such a bet, and they are selling the chips as fast as they can make them.
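The yield part of that argument can be illustrated with the textbook Poisson yield model, Y = exp(-A * D0). The Python sketch below is only that: the defect density of 0.1 per cm2 is a made-up placeholder (real foundry numbers aren't public in this thread), wafer-edge losses are ignored, and the function names are my own.

```python
import math


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under a simple Poisson model."""
    area_cm2 = die_area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


def relative_cost_per_good_die(area_a: float, area_b: float,
                               defects_per_cm2: float) -> float:
    """Cost of a good die A relative to a good die B, assuming cost is
    proportional to silicon area divided by yield (edge losses ignored)."""
    cost_a = area_a / poisson_yield(area_a, defects_per_cm2)
    cost_b = area_b / poisson_yield(area_b, defects_per_cm2)
    return cost_a / cost_b


D0 = 0.1  # hypothetical defect density in defects per cm^2, for illustration only
for area in (400, 800):
    print(f"{area} mm2 die: yield ~{poisson_yield(area, D0):.0%}")
print(f"800 mm2 costs ~{relative_cost_per_good_die(800, 400, D0):.1f}x "
      f"per good die vs 400 mm2")
```

Even with a fairly benign assumed defect density, doubling the die area costs noticeably more than double per good die in this model, which is the economics the comment is pointing at.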
AMD doesn't have this luxury, and Lisa Su knows that well, which is why they can only compete in the segments that make sense. Only once ROCm can compete directly with CUDA, and demand for it is high enough, could AMD afford to do a monster-die-size chip. But AMD already plans to use the same chiplet strategy for GPUs, and hopefully everything learned with EPYC will be put to full use there.
AMD's resources are limited, and they picked sensible priorities:
1. CPU first, GPU next. A breakthrough on the CPU side is easier than on the GPU side: just go with more cores via chiplets, since Intel basically stopped innovating, while the GPU side will be much tougher.
2. Data center first, consumer/gamer second. Vega is not meant to compete with the best Nvidia card; it was designed to handle both data center/ML needs and gaming needs, and maybe the gaming version is just a placeholder. The data center version will bring more profit and buy time for AMD to develop the software ecosystem. CUDA is Nvidia's moat, and AMD needs time to overcome that.
So 7nm is used on the data center version instead of a gaming card, which makes perfect sense for AMD.
Isn’t AMD well positioned to make a hybrid unit which works well as a CPU and GPU?
> Do you think the 9th Gen Intel chips would be octocore without Ryzen?
Recently got a new work laptop, a 12” with a 4C/8T i5. I definitely wouldn’t have expected that without Ryzen on the market, which also is available in business laptops from Dell in a 4C/8T config.
But is Ryzen really in the laptop market? There are just a few low-end models, nothing that I would consider buying, so for laptops I'm kind of stuck with Intel.
Well, not really, only low end and a bit of midrange. No chance to buy a trivial retina display with any Ryzen U. Like they just dump excess inventories of TN and HD IPS panels at whatever AMD has to offer, even if Ryzen U is far more suitable to 2.5k/3k/4k screens than any Intel UHD.
AMD, please consider making your own premium notebook brand to teach your 3rd party manufacturers what your APUs are capable of!
The Dell models we have access to can be configured with Ryzen 7 Pro with Vega GPU and a 4K panel. Is that midrange?
OK, thanks, first time I've heard about something like that! Will check it out (preparing money now)...
Do you know the exact models?
The one I see is a 14" Latitude 5495, configurable with a Ryzen 7 Pro 2700U. I was wrong about the 4k screen, it's 1920x1080.
Alright, a glimmer of hope for a decent Ryzen notebook has just died :-/
What’s decent? I do HPC, data science and dev work, and that laptop is more than enough.
Specifically a retina-style display. I have perfect vision, can't go back to 1080p, and have used retina/HiDPI screens exclusively for 5 years now. I even got the very latest LG 5k2k ultrawide display yesterday.
Wow ok.. complete opposite, I intentionally chose a lower res display because I find it easier to read, obviates endless resolution problems and uses less battery.
Even without actual Ryzen chips in laptops, we get the benefits. The Thinkpads and XPS series having low-power Intel quad cores is a direct result of the desktop chips going to six and eight cores.
Is there a 6-core mobile (low power) Ryzen CPU for laptops? I don't think there is, which is a pity.
I think it's impossible, because the whole concept of the Zen architecture is to produce 4-core dies, and then glue them together to create a gigantic 8-core or 16-core processor. There isn't a 6-core (or 8-core) Ryzen mobile processor, because it wouldn't physically fit inside the laptop.
But the Rome architecture, unveiled yesterday, uses 8-core dies, so there's hope.
I was looking into getting an A485 Thinkpad which has a Ryzen in it. Sadly, from reviews it sounded like AMD's platform isn't as good at getting into really low power states as Intel is and that shows up in battery life.
Yep Dell has Ryzen Pro 3/5/7 laptops, with Vega GPUs with same TDP as Intel machines.
edit I almost ordered one but went with a different model to choose the keyboard layout. So my point is that they do exist.
I saw some pretty slick business models come out from HP. Actually held one that somebody showed me, it seemed well built.
It'll obviously take time for high quality AMD-based laptops to become a normal thing, but with the CPU being on par and the GPU clearly being better than Intel's offering, it should really only be a matter of time.
I think though with Dell offering laptops, desktops, and servers with AMD components the time is already here.
OTOH maybe this Dell stuff isn’t on consumer market yet.
Yo! I just bought an E485 from Lenovo, and it's definitely not "low". 4C/8T 2.5 GHz, Vega 10 graphics, 16GB RAM, 128GB SSD and 1080p 14" matte display.
The A485 is better spec'ed with docking capability, ports, and an external battery.
Linux works on it with some tweaks, and in 4.20 full support is added.
Granted, it's harder to find, but I am hopeful that with Ryzen performing well, and their Vega GPU beating the crap out of Intel integrated and competing with Nvidia MX150, that we'll see more respect from laptop makers.
tl;dr Buy a thinkpad.
The mi60 is not in the same class as the 2080ti. It's a 331 mm2 chip competing against a 775 mm2 one.
I would think it's tough to recapture significant marketshare & performance parity in two markets at the same time.
Hopefully as revenue in CPU's increases, some of that can then be invested in R&D on GPU's.
This wouldn't be the first time AMD has ceded the top tier to NVidia but still been pretty competitive in the mid range and down. This was what they were doing in the HD 4000 series era.
> Is it too much to ask of AMD to handle both?
They are working on a new GPU architecture to address it. It's something post-Navi, supposedly.
What happened to Intel? Have they given any official explanations yet?
How does a company that for decades has been 1-2 process nodes ahead of the industry suddenly get 1-2 process nodes behind everyone else?
There must be an amazing story behind this that no one seems to be digging into.
My take on it is that it is fairly complex.
First there are 'node wars' where what it means to be a "10nm" process or a "7nm" process has become rather murky. This is because transistors themselves don't work well at these sizes and you start getting novel structures which make it hard to compare things. Back when all transistors were flat rectangles it was easier but now they all have some amount of a verticalness to them (FinFets) and there are various patents around this stuff and so nobody talks about 'transistor' size any more they talk about 'feature' size. But what is a feature? Is it a polysilicon line? (equivalent to a trace width on a printed circuit board) Or is it the smallest thing you can render with your lithography process?
All of that helps explain, when you step back, what has happened to Intel. It used to be that what Intel was doing in their fabs would take other fabs 2 - 3 years to do, and they were big differences, like copper, or finfets, or smaller feature sizes. But as time goes on, the features get harder and harder to develop, so when you're two years behind the difference appears smaller and smaller.
What is more, trying something that doesn't work out is getting more and more expensive and time-consuming. And costs are huge here.
That introduces part three of the puzzle: as fabs have been closed and companies have switched to using TSMC, we have gone from having a dozen semiconductor companies spending their R&D budgets on their own process improvements in competition with Intel, to those dozen companies sending their R&D dollars to TSMC, which in aggregate can then spend more on R&D than Intel does while still being profitable enough.
So the bottom line is that the market has settled out and there are just a few giant foundries (Global, TSMC, Intel, Samsung, etc.) and one of them (Intel) doesn't make a business out of others' use of their equipment. Worse, the biggest consumer of silicon has become phones, and Intel isn't a serious player there.
Intel is under siege and I think they know it (they certainly act like they know it).
So the old myth that Intel's leadership came from being a uniquely talented company was false?
And it was mostly about the standard "experience curve", i.e. the fact that larger volumes being manufactured generally lead to better results?
In my experience, no company is uniquely talented forever. The reasons for that are complex as well (Christensen did a good job of sketching the mechanisms in The Innovator's Dilemma). Every time you have a CEO change there is a new vision of the "secret of this company's success" which is likely not the same as the previous CEO's vision. As a result different things get emphasized, or priorities get shifted, and then the next thing you know that center of excellence isn't as excellent as it once was.
From an organizational dynamics point of view it really shows the value of process as a means of preserving institutional integrity and durability, but we're humans and communities change. So companies change, some employees leave, some new ones show up, and the mix may not be as effective as the previous mix in getting stuff done.
I really admire Dan Warmenhoven's management philosophy which was very low politics. People serve their ambition in one of two ways, by lifting up the community around them or by pushing everyone but themselves down. The latter type destroy companies and senior management's role is to be the antibody that detects and then removes the offending folks.
So you need to build a healthy community of employees who are working toward lifting the company further. The people who do that, and the community itself, however move on eventually, and the special quality of the group can erode over time.
>I really admire Dan Warmenhoven's management philosophy which was very low politics. People serve their ambition in one of two ways, by lifting up the community around them or by pushing everyone but themselves down.
Is there a book about it ?
Just because a company loses its competitive edge doesn't mean that they weren't better at what they did at one point. Sears just filed bankruptcy, but that doesn't take away the impact they had on retail throughout the 20th century.
No one could take Intel's role in history.
But why did they have such a role is still an interesting question.
The explanation I've seen most frequently is that process sizes aren't really comparable between fabs any more. In this case, that would mean that Intel's 10nm process is equivalent to another fab's 7nm process.
Comparing processes between fabs might not make sense, but the real question is — does it still make sense within a single fab? If so, then it's still very much the case that Intel, for whatever reason, has been stagnant while AMD has been moving forward at a fair pace, which still lends credibility to the narrative that AMD is closing the gap and Intel is losing some of its lead. Of course, as has been the case for decades, it still remains to be seen whether this will translate into AMD taking more of the market.
Is (Transistor Count / Die Size) something that produces a meaningful number?
Not really. E.g. the AMD ZEN+ moved to a 12nm process but kept both transistor count and die size constant. They instead used the headroom to space features out and improve cooling.
I think the HDL guys were running around with their hair on fire for Spectre bugs, hence why it was a straight transistor shrink with next to no logic changes. It's previously unheard of to not take advantage of a process shrink with logic changes; so much of your logic decisions are ultimately rooted in the process node.
You could still specify the maximum possible transistor density for the process. It doesn't mean a concrete design actually has to use it. Or make it an SRAM bit, because caches take up the bulk of the area anyway.
Pretty much equivalent to SRAM cell density, which you can measure under a microscope.
Speaking of SRAM, it's interesting how much effort is devoted to keeping the logic busy so they don't have to spend chip area on cache.
It's a useful number but it's not the whole story. Fitting more transistors into a given area lets you put more chips on a wafer, which is good economically. And it correlates with performance, but Intel, for example, has traditionally accepted more restrictive design rules in exchange for more performant transistors, and that has hurt their effective transistor density even if their individual transistors have been fast.
It gives you a bound on the areal density of the transistors, which correlates with their switching and power performance.
The problem is you need to be able to produce it in volume. The originally projected 10nm is better than TSMC 7nm, assuming the 2019 10nm is still the same process, which rumour says it is not. It will be up against TSMC 7nm+, the next generation of 7nm.
Let's just assume they are both equal in absolute terms. By late 2019, Intel will have barely launched 10nm and will possibly be shipping in 30-50M quantities (and I think even that is an optimistic number). TSMC would have shipped more than 300M 7nm chips across their entire 7nm generation.
And TSMC has 5nm ready in 2020. I don't think Intel will have their EUV 7nm ready even in 2021.
On top of that, there is exactly ONE EUV equipment maker on the market, ASML, and they have limited capacity for producing these machines. As far as I am aware, all of the 2018 and 2019 capacity is already locked up by Samsung and TSMC.
Yeah, but that's not a good one, since they have been using the same process for 4 years if I'm not mistaken.
Whatever metrics are used, they are stuck on it for way longer than they should.
Yeah, I don't think anyone would dispute that Intel has been having issues in recent years shrinking their process. The interesting question would be whether other fabs would end up facing similar issues with their next process node.
The rumor I heard was that Intel's shrink to 14nm was pushed really hard, and as a result a lot of key talent quit. While I'm not sure if it's true, it would explain a lot.
I'm not trying to be snarky; I actually don't understand. Isn't a nanometer the same between fabs?
A nanometer is the same everywhere, but what you’re measuring isn’t. When they say 7nm, are they talking about the smallest feature they can produce, the minimum wire size, the minimum transistor size, the average transistor size, or...?
For an analogy, a GHz is a GHz everywhere but that doesn’t mean a 3GHz CPU is always faster than a 2GHz CPU.
If AMD can suggest that they are on a smaller process size because they are measuring a smaller feature, why wouldn't Intel just start measuring the same feature on their chips? I have trouble believing they would stick to some principle about what is the right feature to measure at the cost of losing out on marketing themselves.
7nm does not refer to any feature size. Process node names have continued to follow the pattern of the next node being named as roughly the current node divided by sqrt(2), even though density increases are no longer coming from simple uniform horizontal shrinks.
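To make the naming pattern concrete, here is a minimal sketch of the divide-by-sqrt(2) rule. The 90nm starting point is an arbitrary assumption and `next_node_name` / `node_sequence` are hypothetical helpers; the output is just the raw arithmetic, not any foundry's official naming.

```python
import math


def next_node_name(current_nm: float) -> float:
    """Nominal name of the next node under the rule: current / sqrt(2)."""
    return current_nm / math.sqrt(2)


def node_sequence(start_nm: float, steps: int) -> list[float]:
    """Apply the divide-by-sqrt(2) rule repeatedly, starting from start_nm."""
    nodes = [start_nm]
    for _ in range(steps):
        nodes.append(next_node_name(nodes[-1]))
    return nodes


if __name__ == "__main__":
    # 90nm is an arbitrary starting point chosen for illustration only.
    for nm in node_sequence(90.0, 8):
        print(f"{nm:5.1f} nm")
```

Under an ideal uniform shrink, dividing the linear dimension by sqrt(2) halves the area per feature, which is where the factor comes from historically. The point of the comment above is that the names kept following this rule even after the physical dimensions stopped doing so.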
They might, just like the MHz marketing wars. But for now consumers don't care about "7nm" as a marketing figure enough for them to care if it's actually 7nm or not.
Like others wrote, there's no longer a standard for what the measurement actually means. Most structures aren't actually 7nm in a 7nm process.
For example, a typical metal pitch on the low metal layers is 40nm, meaning you get one wire every 40nm, or 25 wires in parallel in a 1um channel.
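The wire-count arithmetic is easy to check. A small sketch, where `wires_per_channel` is a hypothetical helper and the 7nm pitch in the second call is a deliberately unrealistic value, used only to show how far the node name is from the real pitch:

```python
def wires_per_channel(channel_width_nm: float, metal_pitch_nm: float) -> int:
    """Number of parallel wires that fit in a channel, at one wire per pitch."""
    return int(channel_width_nm // metal_pitch_nm)


# 1um (1000nm) channel at the 40nm pitch quoted above.
actual = wires_per_channel(1000.0, 40.0)

# Hypothetical: what the count would be if the pitch really were 7nm.
if_name_were_literal = wires_per_channel(1000.0, 7.0)

print(actual, if_name_were_literal)
```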
What Intel is calling 10nm does indeed appear to be close to the others' 7nm. Then again, Intel is seriously behind on 10nm, so the bottom line remains the same: they seem to have essentially lost their process advantage.
Then why do they continue to use it at all? It's like measuring your electric car in the number of pistons it has. We all agree process node numbers are meaningless now so why do we keep using it?
How about a number that actually relates to the performance and can be measured?
My best guess is a combination of historical inertia and the fact that the names are actually meaningful, just not in the way that one might naively expect as an outsider.
When the foundry sends you a design kit which contains all their design rules and tooling around a process, then this process has some codename that appears everywhere (think filenames, names of library elements, and so on). This codename tends to be something like GF14 (for GlobalFoundries' 14nm) or N7 (for TSMC's 7nm) plus cryptic suffixes for different revisions of the foundry process.
So the 14/12/10/7nm terms are actually part of the design engineers' everyday work flow. They just also filtered through to marketing for whatever reason.
I could imagine that at some point in the future, foundries will switch to a year-based versioning similar to what happened with a lot of software. So you'll have a GF2027 process and so on. That's pure speculation on my part though, and inertia is definitely a thing.
Nanometers is a marketing term now, just like frequency used to be a marketing term when talking about CPUs.
In reality, the sizes of various features in a CPU differ widely. Intel 10nm could have a transistor gate pitch of 50nm, while TSMC 7nm could have a pitch of 60nm. All the meaningful parts you care about, like the size of the transistor components and interconnects, are _not_ small, and every company designs its own tweaks on these building blocks for reliability/manufacturability/performance/power/etc.
Read this and follow links: https://en.wikichip.org/wiki/7_nm_lithography_process
Those differences seem enormous.
SRAM bitcell, High-Density (HD)
A Intel 14nm: 0.064 µm²
B Intel 10nm: 0.0312 µm²
C TSMC 7nm: 0.027 µm²
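For a sense of scale, the three bitcell areas can be turned into ratios and a raw bit density. This is a rough sketch: the areas are the ones quoted above, the helper names are made up, and the density is an upper bound that ignores sense amplifiers, decoders and routing.

```python
# High-density SRAM bitcell areas in square micrometres, as quoted above.
BITCELL_AREA_UM2 = {
    "Intel 14nm": 0.064,
    "Intel 10nm": 0.0312,
    "TSMC 7nm": 0.027,
}


def raw_mbit_per_mm2(cell_area_um2: float) -> float:
    """Upper bound on bit density, counting bitcells only.

    1 mm^2 is 1e6 um^2, so cells per mm^2 = 1e6 / area. Dividing by 1e6
    again to express the result in Mbit leaves simply 1 / area.
    """
    return 1.0 / cell_area_um2


def area_ratio(larger: str, smaller: str) -> float:
    """How many times bigger the first process's bitcell is than the second's."""
    return BITCELL_AREA_UM2[larger] / BITCELL_AREA_UM2[smaller]


if __name__ == "__main__":
    for name, area in BITCELL_AREA_UM2.items():
        print(f"{name}: {raw_mbit_per_mm2(area):.1f} Mbit/mm^2 (bitcells only)")
    print(f"Intel 14nm -> 10nm: {area_ratio('Intel 14nm', 'Intel 10nm'):.2f}x")
    print(f"Intel 10nm -> TSMC 7nm: {area_ratio('Intel 10nm', 'TSMC 7nm'):.2f}x")
```

By these numbers the step from Intel 14nm to Intel 10nm is roughly 2x, while Intel 10nm to TSMC 7nm is only about 1.16x, which fits the "roughly equivalent" reading elsewhere in the thread.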
Is B even in production at all?
AFAIK there's a single 10nm Intel processor available – the Core i3-8121U. Performance is sub-14nm though.
It is. But what exactly you measure in nanometers is different, depending on how you define the "size of the structures"
A nanometer is indeed a nanometer. The question is what exactly is being measured, and that is what differs.
Whose explanation is this? Do you have a link?
Unfortunately, I don't know. As I said, it's just the explanation I've seen most frequently when browsing HN, and I'm not knowledgeable enough to know where to find reliable sources. Sorry.
Every semiconductor analyst has been on this story for the last two years. https://semiaccurate.com/tag/10nm/
Without a satisfactory answer in my opinion, which is my point
Semiaccurate actually has some pretty detailed dives into exactly what is going wrong with Intel's 10nm but it's all subscriber only. They've also done some reporting on leadership problems inside Intel that I think make the execution failures more understandable.
Thanks. I’ve heard this podcast with Ashraf Eassa, which I recommend listening to, but IMO it still fails to paint the complete picture. Maybe we’ll only know for sure in a few years.
I think that Intel was ahead only because their decades of dirty tricks[e.g. 1,2] gave them a sort of artificial Monopoly.
Now AMD is (rightfully?) pulling ahead.
Intel wanted to do a number of ambitious things with their 10nm node such as putting contacts over gates and using cobalt for wiring. These were gambles of the sort they had made before but this time the advances just didn't work out. Now they're trying to do a 10nm process without CoG and I hope it works out this time.
The end of Moore's Law happened. Intel is not behind, AMD is likely reaching it by now (as hard as it is to compare the processes).
Moving ahead gets much harder the more you advance, so it may even be that Intel still has a 20 month lead over AMD, but since it is so much harder to move, the same lead becomes a small difference.
You answered your own question. They were ahead for a long time so they got complacent.
But the story we want told is who got complacent? Was it the C-levels making budget decisions? Was it engineering staff that left for greener pastures or was it R&D decisions that ended up at dead ends with nothing to show for it?
It's easy to point to the big player and say they lost their lead, but the fine details about who made the decisions that landed them there is the story we want told.
If anything, the truth seems to be the opposite. Their 10 nm process was so aggressive (e.g. 2.7x density) that they couldn't make it work.
One slide noted that their 7nm processor had 13.28 billion transistors per 331 mm^2.
Although 7nm/10nm have mostly become marketing terms, is this something directly comparable between fabs/companies?
Does anyone have a comparison for Intel? Best I can find is this list: https://en.wikipedia.org/wiki/Transistor_count (and if that's accurate and up to date, it looks as though AMD will be well ahead).
The best way to compare CPUs would be performance/watt benchmarks which is what matters ultimately to the customer; in this case, data center customers. Let me clarify - we are not talking about Geekbench type benchmarks. Each customer has their own validation & qualification process for new datacenter chip procurement.
Public gets hung up with all kinds of marketing terms (7nm, 7nm+, 10nm, 10nm+, 10nm++, etc). What does 7nm+ even mean!? It is purely a marketing term and large datacenter customers know this. They run their workloads on test samples and make a decision to go with Intel or AMD.
Furthermore, there is also the aspect of maintainability, servicing and infrastructure inertia that is priced into Intel's server chips. An Apple-to-Apple chip comparison (sorry for the pun, not intended) between Intel & AMD would not be priced the same, since Intel knows that there is a giant amount of switching inertia for a customer moving to AMD Epyc. Furthermore, datacenter customers want predictability and proven performance. In this case, Intel again wins with its history, and you betchya it's modeled in the pricing.
So, this is all business as usual. HN loves beating on Intel but their numbers in quarterly reports depict a different story.
Let me repeat: No sane customer gives a shit about 7nm or 10nm. My comments are only applicable to datacenter customers. Desktop/Client chips are a whole another enchilada where marketing plays a bigger role (have you seen the ridiculous packaging from AMD & Intel? This is to please the RGB Gamer crowd).
The feature size is relevant for understanding how the chips progress over time. It's not relevant for a point-in-time purchase, but it's not a marketing thing to ignore either.
The plus means a generation of optimization on how to use roughly the same lithography tech, which can give you a big difference when they're so hard to use.
It really doesn't matter. What if going from 10nm to 1nm means you need to use simpler features, reducing performance?
CPUs won't necessarily evolve toward smaller feature sizes - it's just what we've seen so far.
You can always use your 1nm tech to craft more precise circuits of the same size, packing them better. When it comes to the working chips that are made, assuming some tighter lithography that's equally functional to the previous one, it's only going to benefit.
As I understand it, Intel's 14nm+ and ++ process revisions actually increased the effective feature size compared to their previous 14nm processes. As you say, transistor density isn't everything.
These links might be useful:
^ err. Actually, those figures are for the GPU. Nevermind.
This page puts the 10nm Intel density as 100.8 million transistors per mm^2. Your number is 40.1 million/mm^2.
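The 40.1 million/mm^2 figure is just the slide's transistor count divided by its die area. A quick sketch of that division (the helper name is made up), bearing in mind that the slide figures were for the GPU and that a shipping design rarely reaches a process's peak density, so the two numbers are not a like-for-like comparison:

```python
def density_mtr_per_mm2(transistors: float, die_area_mm2: float) -> float:
    """Average transistor density of a finished die, in millions per mm^2."""
    return transistors / die_area_mm2 / 1e6


# Figures from the slide quoted earlier in the thread.
slide_density = density_mtr_per_mm2(13.28e9, 331.0)

# Density quoted above for Intel's 10nm process.
INTEL_10NM_QUOTED = 100.8

print(f"slide: {slide_density:.1f} MTr/mm^2")
print(f"quoted Intel 10nm / slide: {INTEL_10NM_QUOTED / slide_density:.2f}x")
```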
This is really fun to watch. AMD is giving people EXACTLY what they want (again), and intel is having to fight dirty (again).
Possible example (not at all out of character for intel): why are so many people parroting that 10nm technical superiority junk without supplying sources?
Just to be clear, the discussion about 7nm vs. 10nm is not junk, though it's hard to get concrete sources – WikiChip has some useful data (https://en.wikichip.org/wiki/10_nm_lithography_process)
Basically it seems like Intel kind of over-extended on their 10nm process by trying to introduce a bunch of new techniques, and they had trouble scaling this to volume production. But I think it's generally accepted that the Intel 10nm process and other manufacturers' 7nm processes were broadly equivalent, and it seems unfair to accuse people of being shills for thinking so!
I have seen people supply sources at least twice, frankly I'm amazed given this comes up 5 times per day here that we have to keep explaining it.
Here is one source, again:
These are very nice and detailed feature figures, but I do not see that intel has any advantage from them (possibly my reading comprehension is failing here...). Can you cite a source that clearly shows intel is manufacturing and selling something better than the competing fabs?
they aren't selling anything at the 10nm yet, who knows how it will perform!
but they can fit more on their chips at their process they call 10nm than the others can fit on their chips that they call 7nm.
So it is correct to say that assigning a single size to a node is misleading. There are many dimensions you can measure.
They are selling something on 10nm:
Increasing the vector width to 256 bits (assuming no crazy thermal throttling) is a pretty big deal and would get me to move off Intel, unless Intel can figure out 512 bit widths without massive throttling.
That's really a matter of Intel's 10nm process (which is roughly equivalent to TSMC 7nm).
AMD used 128-bit and simulated 256-bit by doing 2 passes. This reduced peak power consumption and allowed them to keep clock consistently high. That matters because while your AVX is going slowly, your non-AVX is also going slowly. There was simply no way x86 could do vectors that wide on 14nm without throttling.
With the 7nm shift, AMD can use the reduced size to increase to native 256 at full speed (and they may do 512 in 2 parts). I expect Intel to do the same when they replace their 10nm process with something that works. It'll probably be a couple more shrinks, though, before 512 can be run at full speed.
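As a purely conceptual picture of "256-bit in two 128-bit passes", here is a toy model in Python. It assumes 32-bit lanes and an invented `add_128` unit; it says nothing about how the real hardware schedules micro-ops, only that the result is the same while each pass touches half as many lanes.

```python
LANES_128 = 4  # four 32-bit lanes fit in 128 bits


def add_128(a: list[float], b: list[float]) -> list[float]:
    """One pass through a hypothetical 128-bit-wide adder (4 x 32-bit lanes)."""
    assert len(a) == len(b) == LANES_128
    return [x + y for x, y in zip(a, b)]


def add_256_split(a: list[float], b: list[float]) -> list[float]:
    """A 256-bit add issued as two 128-bit passes: low half, then high half."""
    assert len(a) == len(b) == 2 * LANES_128
    low = add_128(a[:LANES_128], b[:LANES_128])
    high = add_128(a[LANES_128:], b[LANES_128:])
    return low + high


if __name__ == "__main__":
    a = [float(i) for i in range(8)]
    b = [10.0] * 8
    print(add_256_split(a, b))
```

The trade-off described above falls out of this shape: two passes cost throughput on wide vector code, but only a 128-bit unit is ever switching at once, which keeps peak power down.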
This AMD card can compete with NVIDIA's high end Tesla V100 accelerator.
At 7.4 TFlops of double-precision, it is smack in the middle between the PCIe version of the V100 at 7.0 and the NVLink version at 7.8.
Memory bandwidth for the MI60 is a bit better at 1000GB/s, compared to the Tesla V100's 900GB/s.
However, AMD's problems are usually not the actual hardware, but the software around it. NVIDIA has done amazing work with CUDA and the surrounding frameworks, while AMD has not really. They really need to catch up on software that makes writing code for their GPUs more trivial.
64 core EPYC chips based on Zen 2 is what really blows my mind.
Threadripper 3 with 64 cores is going to be mindblowing! It wasn't that long ago that the Parallella board advertised 64 slow cores, and soon we can get that many high-end x86/x64 cores!
It is pretty crazy. I felt the same way. Individual x64 cores tend to be so much more powerful than other architectures, and now single chips will effectively have 128 logical cores.
For my purposes (large builds and rendering), I think RAM prices are holding back AMD here. To feed that many cores, you want really big RAM sticks. The CPUs have become a comparatively small cost compared to the RAM these days.
I've recently built a TR-based DL/ML workstation and bought 128GB ECC 2,667MHz UDIMMs for ~$1600, roughly the same price as 2990WX, but would have vastly preferred to get 256GB instead. Unfortunately, only Samsung is now sampling 32GB ECC DDR4 UDIMMs - I haven't seen them anywhere yet, and I expect the price is going to be insanely high :-(
Speaking of insanely high RAM prices, I just came across receipts for a PC I built in 1992. So 26 years ago, I paid $495 for 4MB. Yup, that's MB, not GB.
Admittedly these were AUD rather than USD. So maybe halve that for the USD cost.
When we complain about how expensive memory and compute are, a slightly longer term view shows it's still pretty good value!
You realize that it's not about compared to 25 years ago though, right? When I looked at the beginning of this year, the same RAM size and speed was about twice as expensive as it was two years ago.
The most important bit WRT TR3 is going to be the central I/O chiplet instead of dividing memory controllers between individual Zeppelin dies. No more NUMA headaches to deal with on their workstation/enthusiast CPUs. I'm glad that AMD saw that such an approach wasn't going to work long-term (at least not for the time being, when basically anything outside large database systems and hypervisors lacks even basic NUMA-awareness).
Do you know if that would allow all cores to have the same memory access speed like the current (16c in 2990WX) directly connected ones, or if it imposes a penalty (the same?) on all of them?
It's really hard to say what the memory latency is going to be, but at the very least this will mean that latency will remain consistent for access to every installed DIMM regardless of which CCX the request originates from.
On that note I'm really interested to see if a dedicated I/O chiplet will help with the memory frequency scaling issues we see with the IMC on Zen/Zen+. I'm not sure what made the integrated controller on Zen so finicky compared to Intel's IMC, but this move will at the very least allow AMD to bin memory controllers if they want to or maybe work around some issues with their design.
how often can 64 cores be effectively utilized without bumping up on memory throughput as a limiter?
This is pretty challenging at 32 cores! I know these chips ship with big l3 cache but l3 cache isn't so fast either.
I'm wondering if we will ever reach a point at which even interpreted languages will be bottlenecked on memory bandwidth rather than cache misses.
Fictional process geometries are the new MHz wars.
Is that really the case? I'm not seeing process node sizes being plastered over computers as a selling point.
It's being mentioned more during product launches than it used to be. For example, Apple heavily promoted the iPhone XS's 7nm SOC where previously it wouldn't have been mentioned.
I would guess this is due to other vendors starting to surpass Intel and wanting to highlight their process lead.
Is anyone knowledgeable enough to comment on the memory bandwidth? I thought Zen-1 was eight channels with 32 cores, and now Zen-2 is the same eight channels with 64 cores. Wouldn't that cause issues, or can the new memory system be that much better?
The other interesting thing is that they said memory access would be more uniform, kind of NUMA-independent, given that the controller is no longer part of the individual chip but a common element. That definitely makes good performance easier with such a beast of a chip, but does it come at the cost of the lowest possible latency, as when in Zen-1 the memory access was from a channel in the same CPU? I would hope that a massive single IO chip would allow them to design the thing better, but does anyone know or care to guess?
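To put rough numbers on the channels-per-core question, here is a back-of-envelope sketch of peak theoretical bandwidth per core. The DDR4-2666 speed and the 64-bit (8-byte) channel width are assumptions, not figures from the announcement, and the helper names are made up. Real workloads see less than peak, and caches change the picture considerably.

```python
def peak_channel_gbs(mega_transfers_per_s: float, bus_bytes: int = 8) -> float:
    """Peak bandwidth of one DDR channel in GB/s (transfers/s * bytes/transfer)."""
    return mega_transfers_per_s * 1e6 * bus_bytes / 1e9


def per_core_gbs(channels: int, cores: int, mega_transfers_per_s: float) -> float:
    """Peak bandwidth per core if all cores stream from memory at once."""
    return channels * peak_channel_gbs(mega_transfers_per_s) / cores


if __name__ == "__main__":
    ASSUMED_MTS = 2666.0  # assumption: DDR4-2666 on both generations
    for cores in (32, 64):
        print(f"8 channels, {cores} cores: "
              f"{per_core_gbs(8, cores, ASSUMED_MTS):.2f} GB/s per core")
```

With the same eight channels, doubling the core count halves the per-core share unless memory speed rises to compensate.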
Probably bodes for worse memory performance for (most) VM hosts and better memory performance for (most) bare metal workloads. I don't think anyone is concerned with many-VM class workloads not having the highest possible memory throughput, though, and I doubt going to even 16 channels would make a big difference anyway. It'll be somewhat easy to find out if memory bandwidth needs to scale linearly or you hit massive perf losses in the real world, though, as Intel's competing 2x24core die was announced to have 12 channels.
There was a longstanding rumor that the IO chip was going to have ~512MB of L4 cache. Considering it wasn't announced, I'm thinking that turned out not to be true, but from a pure performance perspective that is probably more useful than a couple more memory channels (though likely more complicated).
I'm really curious about cache on the IO die. Ian Cutress of Anandtech said Rome is ~1000 mm² in total. Based on that and the pictures of the Rome die that were shown, some users estimated the size of the IO die to be 387 - 407 mm².
- Zen 1 (8 cores) with IO is 213 mm².
- the Zen 2 core only chiplets are estimated to be around 70 mm².
- if we assume IO scales as well as the rest (which it doesn't) Zen 1 would be ~106 mm² on 7nm.
- let's just say the difference between Zen 2 core only chiplets and the imaginary Zen 1 on 7nm is the size of the IO per Zeppelin die => ~36 mm²
- now double the area again because the IO die is on 14nm => 72 mm²
- now quadruple the size because we have 8 memory channels and 128 PCIe 4.0 lanes => 288 mm²
Going by my flawed layman estimation this would mean we still have a budget of ~100 mm² for additional functionality. Either PCIe 4.0 takes much more space than PCIe 3.0, they have some secret sauce in there, or maybe just a large L4 cache.
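The estimate above is easy to replay step by step. This sketch uses only the numbers from the list, with the same admittedly flawed assumptions (everything halves in area going to 7nm, IO needs scale by 4x); the constant and function names are made up for illustration.

```python
# All inputs are the estimates quoted in the comment above, in mm^2.
ZEN1_DIE_14NM = 213.0             # Zen 1 die (8 cores) including IO
ZEN2_CHIPLET_7NM = 70.0           # estimated core-only Zen 2 chiplet
IO_DIE_ESTIMATE = (387.0, 407.0)  # user-estimated range for the Rome IO die


def estimate_io_budget(shrink_factor: float = 2.0, io_multiplier: float = 4.0):
    """Replay the back-of-envelope steps.

    Returns (estimated IO area on 14nm, (min leftover, max leftover)).
    """
    zen1_on_7nm = ZEN1_DIE_14NM / shrink_factor       # assume the whole die halves
    io_per_die_7nm = zen1_on_7nm - ZEN2_CHIPLET_7NM   # treat the remainder as IO
    io_per_die_14nm = io_per_die_7nm * shrink_factor  # the IO die stays on 14nm
    io_total_14nm = io_per_die_14nm * io_multiplier   # 8 channels, 128 PCIe 4.0 lanes
    leftover = tuple(est - io_total_14nm for est in IO_DIE_ESTIMATE)
    return io_total_14nm, leftover


if __name__ == "__main__":
    io_area, (low, high) = estimate_io_budget()
    print(f"estimated IO area: {io_area:.0f} mm^2")
    print(f"unexplained area: {low:.0f} to {high:.0f} mm^2")
```

Without the intermediate rounding used in the list (36 instead of 36.5), the IO estimate comes out a few mm² above 288, but the leftover still lands at around 100 mm².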
If they use EDRAM instead of SRAM like Intel did with some of their Broadwell and Skylake CPUs, they could probably fit quite a bit of cache in this area. Intel used 128 MB EDRAM fabbed on a 22nm node, which required 84 mm².
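As a rough feel for what "quite a bit" could mean, this sketch scales Intel's 128 MB in 84 mm² linearly to the ~100 mm² budget estimated above. It assumes the same density as Intel's 22nm part, which is a big assumption since the IO die is on a different process from a different foundry; the names are invented.

```python
INTEL_EDRAM_MB = 128.0        # capacity of Intel's EDRAM die, as quoted above
INTEL_EDRAM_AREA_MM2 = 84.0   # its area on a 22nm node, as quoted above


def edram_capacity_mb(area_mm2: float) -> float:
    """Capacity that fits in area_mm2, assuming Intel's 22nm EDRAM density."""
    density_mb_per_mm2 = INTEL_EDRAM_MB / INTEL_EDRAM_AREA_MM2
    return area_mm2 * density_mb_per_mm2


if __name__ == "__main__":
    print(f"{edram_capacity_mb(100.0):.0f} MB in 100 mm^2 at 22nm EDRAM density")
```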
I have a feeling this is not a AMD vs Intel or Apple vs Intel.
This is TSMC vs Intel. TSMC basically make the 7nm chip for Apple and AMD.
It looks like this company, headquartered in Taiwan, provided the bragging rights for Apple and AMD…
I don't think it's one or the other rather both. TSMC is definitely starting to lead the foundries but e.g. nobody expects the next gen Qualcomm Snapdragon chip to beat the Apple A12X even though both are coming out of TSMC.
IMO Intel is lagging on both fronts, AMD is catching Intel a bit, ARM is steamrolling year/year perf increases compared to x86, and Apple remains +25% ahead of every other ARM chip.
Well, looks like Semiaccurate was right about Rome's architecture. Personally I'm excited about the wider vector units.
Dr. Lisa Su is so inspiring when she speaks - very clear, sure of her product and confident. Hopefully I get there one day!
Non-AMP link: https://www.tomshardware.com/news/amd-new-horizon-7nm-cpu,38...
Good call, thanks
I actually prefer the amp link...
The original AMP link, for those looking for it: https://amp.tomshardware.com/news/amd-new-horizon-7nm-cpu,38...
nm is the new megapixel
It's the MHz. Intel went down the MHz rabbit hole with NetBurst and it gave AMD a temporary advantage (P4 vs. Athlon 64). Are we seeing the same thing again with 7nm vs. 10nm?