An interesting article in PC Gamer states that Intel's 13th- and 14th-generation CPUs are failing at a very high rate.
"Shockingly, the game studio [Alderon Games] also says that "the failure rate we have observed from our own testing is nearly 100%, indicating it's only a matter of time before affected CPUs fail.""
"But if things are pointing away from power and temperature being the culprit for these Intel CPU crashes, as does seem to be the case, and if things continue pointing towards chip degradation, we may very well be looking at something else. "
What they call degradation could itself be caused by power and temperature problems. Chips are normally designed to a baseline reliability target, typically running at 100 °C, 24 hours a day, for 5 years. If Intel is having manufacturing problems, its chips might not be meeting a spec like this, which would cause failures in chips kept under heavy load for long periods, as they are in servers. Ordinary users might not see this, since few of them run their machines at 100% load for days, weeks, or months at a time.
At 5 nm and below, the reliability rules (especially for the metal layers) become very complicated. Temperature is a huge factor. A chip can pass all the normal tests and still fail later due to metal fatigue and degradation. This is a chip maker's worst nightmare, since the failures occur in the field at some later date. If Intel doesn't have the proper tools and manufacturing data, and the reliability of its chips is not fully verified, that would explain what customers are seeing.
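To give a feel for why temperature matters so much, here is a rough sketch using the Arrhenius acceleration model, which is commonly used for thermally driven wear-out in semiconductors. The function name and the 0.7 eV activation energy are my own illustrative assumptions, not Intel data, and real failure mechanisms also depend on voltage and current density, which this ignores.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(temp_low_c, temp_high_c, activation_energy_ev=0.7):
    """Return how many times faster a thermally activated failure
    mechanism progresses at temp_high_c than at temp_low_c.

    activation_energy_ev is an assumed, illustrative value; the real
    figure depends on the specific failure mechanism and process.
    """
    t_low_k = temp_low_c + 273.15    # convert Celsius to Kelvin
    t_high_k = temp_high_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_low_k - 1.0 / t_high_k)
    return math.exp(exponent)


if __name__ == "__main__":
    # Compare aging at 100 C against a few cooler operating points.
    for temp_c in (60, 80, 100):
        factor = arrhenius_acceleration(temp_c, 100)
        print(f"100 C ages about {factor:.1f}x faster than {temp_c} C")
```

Under these assumed values, a part held at 100 °C wears out a few times faster than one at 80 °C, which is why a chip that runs hot around the clock in a server can fail long before the same chip in a lightly used desktop.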
Dino survival game developer is switching all its servers to AMD 'which experience 100 times fewer crashes compared to Intel' because it's 'only a matter of time before affected CPUs fail' | PC Gamer
------------
2024: 47 years on the Net.
Comments
"Dylan Browne, an Unreal Engine Supervisor and Feature Film VFX at the ModelFarm visual effects studio, posted on X that his company is experiencing a 50% failure rate for systems powered by Intel's Core i9-13900K and 14900K processors."
Intel announced a microcode fix coming sometime in August, but:
"(This isn't a 'fix' for CPUs experiencing the issue — impacted processors are irreversibly damaged and must be replaced.)"
This is obviously a reliability problem, and once a chip is affected, the damage is permanent.
Unreal Engine supervisor at ModelFarm blasts 50% failure rate with Intel chips — company switching to AMD's Ryzen 9 9950X, praises single-threaded performance (msn.com)
------------
"So, problem solved? Unfortunately, no. There's much more to this mess than Intel suggested last week. In an interview with The Verge, Intel communications manager Thomas Hannaford explained that the bug affects many more chips than previously known. The flaw is present in all Raptor Lake and Raptor Lake Refresh chips with TDPs of 65W or higher. This includes the enthusiast K/KF/KS CPUs with unlocked multipliers, as well as the non-K variants for mainstream use. So, Core i5, i7, and i9 CPUs are all in the mix now."
Intel CPU Crashing Bug Affects Many More Chips Than We Thought (msn.com)
Intel isn't recalling the chips, and it's not clear whether it will replace the ones that have failed. Those are permanently damaged.
"Intel has not halted sales or clawed back any inventory. It will not do a recall, period. The company is not currently commenting on whether or how it might extend its warranty. It would not share estimates with The Verge of how many chips are likely to be irreversibly impacted, and it did not explain why it’s continuing to sell these chips ahead of any fix."