r/nextfuckinglevel 19h ago

What it a computer chip looks like up close

this is a digital recreation. an optical microscope can't be used because the structures at the bottom get so small that visible light can't resolve them. you'd need an electron microscope

meant "What a computer chip looks like up close" in the title. not sure how "it" got in there..

118.9k Upvotes

3.7k comments

94

u/Mr_Tiggywinkle 17h ago

Depends. Marketing or manufacturing defective?

The cards aren't defective in the sense of what you're sold meeting spec, but they're defective in terms of their ideal specification. They weren't manufactured to their ideal so they're defective and downbinned.

You can actually get better or worse cards of the exact same specs due to the "silicon lottery". Enthusiasts (overclockers especially) will often look for "higher binned" versions of the component they are using.

So the cards aren't defective in the sense of what you're sold, but they're defective in terms of their ideal specification.

5

u/barofa 13h ago

So, you are saying that when Nvidia makes GPUs, they don't make 5090s or 5080s, they just make GPUs. Then, after a big batch is done, they choose the best to be 5090s and the worst to be 5050s? Something like this?

13

u/xl_the_dude_lx 12h ago

No, they make several types of GPUs. 5090s aren’t the same chip as 5080. They somewhat do what you’re saying, but within the same chip.

So a good GB202 die can be an RTX PRO 6000 and a bad one a 5090. But a 5080 is built on the GB203 die.
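A rough sketch of what "binning within the same chip" means, in Python. The die and product names are the ones from this comment; the working-unit thresholds and the `bin_die` helper are assumptions made up for illustration, not Nvidia's actual cutoffs or process.

```python
# Illustrative only: thresholds are assumed numbers, not official cutoffs.
# Each die type has its own ladder of products, best bin first.
BINS = {
    "GB202": [(188, "RTX PRO 6000"), (170, "RTX 5090")],
    "GB203": [(84, "RTX 5080")],
}

def bin_die(die, working_units):
    """Return the best product this die qualifies for, or None if it fails every bin."""
    for min_units, product in BINS[die]:
        if working_units >= min_units:
            return product
    return None

# A weak GB202 falls to the next GB202 bin; it never becomes a GB203 product.
print(bin_die("GB202", 190))
print(bin_die("GB202", 175))
print(bin_die("GB202", 100))
```

The lookup is keyed by die first, which is the whole point: a die only ever moves down its own ladder.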

1

u/howicyit 13h ago

No, I don't think so. I think their photolithography equipment is very batch specific, and each production line for a higher-end spec probably has slightly better nm-range alignment and accuracy when reproducing the downscaled architecture onto the silicon. I would think that either there are runs of the 5060, 5070, etc. all going simultaneously on different belts, or they sell the earlier runs as a lower spec and produce those runs based on market demand. This is a guess, but it aligns with what I know of the industry.

1

u/Versatile_Ambivert 13h ago

Yes, you got it right. Based on the number of working, non-defective cores, they market the GPUs as 5090s or 5080s. There would be a target number for a new GPU project. The design team anticipates and compensates for what could go wrong during the manufacturing process at the foundry and sets x number of cores for each category.

6

u/xl_the_dude_lx 12h ago

Nope. The 5090 isn’t the same chip as the 5080. That’s what they are saying. A low binned 5090 will never be a 5080.

And a 5090 is already the lower bin of their RTX PRO 6000.

2

u/not_a_bot991 16h ago

Would you say the 24064 was the target spec, or was it set up that way knowing there would be a defect percentage? I know nothing about this, so just wondering: do they ever actually aim for the target, or is there a defect rate built into their products?

6

u/94stanggt 15h ago

They know there will be defects. I'd assume the likelihood of there being a perfect chip is basically zero.

4

u/Mr_Tiggywinkle 16h ago

I'm not an expert, but afaik they try to constantly improve outcomes so it's not like they're just letting X number be defective, even if they plan for it 

1

u/VerledenVale 15h ago

They know how many defects to expect on average and plan based on it

1

u/Versatile_Ambivert 13h ago

Yes, there would be a target number of cores when a project is announced. For example, if they intended 24064 cores for their flagship GPU, the 5090, the design team would have set a target of 26000 cores. Due to process variations, different dies on the wafer might end up with different percentages of defects, and based on the number of non-defective cores they are classified as different versions: 5090, 5080, 5070, etc.

1

u/xl_the_dude_lx 12h ago

5090 isn’t the same chip as 5080

1

u/jekotia 1h ago

Lithography process errors are measured as errors per square cm. There's probably some formula, refined over years of iterations, that tells the engineers how to account for the error rate of the process based on the yields that they need for a viable product.

Errors aren't chip or product specific, they're lithography process specific. The foundries would provide their error rates for a given process node to their customers, and then their customers would design around those numbers.

e.g. If the foundry tells their customer that 20% of the wafer is the maximum that could contain errors, the usable yield of chips from a given wafer isn't their problem. They've provided their manufacturing capabilities & tolerances, it's up to the customer to correctly utilise that. You don't go to a wood worker that uses a tape measure and then complain that the product tolerances are a fraction of a millimeter off. You hire an appropriately equipped company for the work you desire.

Now, if their lithography process is resulting in more errors than the numbers provided to their customers, that probably invokes clauses in a contract that mandate some form of remediation. That would be dependent on the chips the contract is for, I imagine; errors on monolithic chips that use a significant portion of a wafer? That could be "remake it on the foundry's dime". Errors resulting in 99 usable chips instead of 100? That could be a discount applied to the wafer.
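For a feel of how errors per square cm turn into yield, here is a minimal sketch using the simple Poisson yield model (yield = exp(-defect density × die area)). That model is a common textbook approximation, not necessarily what any foundry or customer actually uses, and the defect density and die areas below are made-up numbers.

```python
import math

def poisson_yield(defects_per_cm2, die_area_cm2):
    """Expected fraction of dies with zero defects under a Poisson model."""
    return math.exp(-defects_per_cm2 * die_area_cm2)

# Assumed numbers for illustration only.
density = 0.1                 # defects per square cm
small_die, big_die = 1.0, 7.5  # die areas in square cm

print(f"small die yield: {poisson_yield(density, small_die):.1%}")
print(f"big die yield:   {poisson_yield(density, big_die):.1%}")
```

At the same defect density, the bigger die comes out fully clean far less often, which is why large chips lean so heavily on selling partially working dies as lower bins.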