GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

I'm looking at broadly similar mini PCs like the Asus PN50, which will come in at $150 or so less at best, and that's before you spend the extra money to add the RAM and storage that, you know, come with the Mac mini.

The M1 is faster in single and multi-threaded CPU tasks. And it destroys AMD and Intel integrated GPU options.

And... and I know this is an alien notion to people who want to buy a fast Ryzen or any sort of modern GPU... it is available to buy. At RRP.

EDIT: I will be the first in line to throw Tim Cook off a tall building, but Apple has some good engineers, and they absolutely killed it on the M1. I am really interested to see how they do in regards to allowing for off-package memory (I bet they can stretch to 32GB from 16GB in a bigger footprint, but who knows what that would do to yields- although the CPU isn't actually on the same die as the memory in the M1, just soldered onto the same package) and external GPUs in the successors. I think they probably need to do both to allow for certain AV-related tasks; the only question is what it will do to memory access speed, which is obviously the source of a lot of their really stellar numbers on the M1.
They reportedly get 68GB/s out of that memory compared to 49GB/s for a Ryzen using DDR4 3200, but the M1 has to share that with the GPU. Putting the memory on the package is weird; maybe that's just temporary and inherited from the phone designs. They might go down the route of graphics cards and consoles and put it on the board instead while still having a unified memory architecture with massive memory bandwidth. Or maybe they're thinking about HBM down the line if prices and availability start to make sense.
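For anyone who wants to sanity-check those numbers, here's a quick back-of-the-envelope sketch. The 128-bit bus widths are my assumption (dual-channel DDR4 on the Ryzen side, and the M1's LPDDR4X interface), not figures quoted above:

```python
# Peak theoretical DRAM bandwidth = transfer rate (MT/s) x bus width (bytes).
# Assumed configs: dual-channel DDR4-3200 (2 x 64-bit) vs. LPDDR4X-4266 on a
# 128-bit interface -- the bus widths are assumptions, not quoted figures.
def peak_bandwidth_gb_s(transfer_rate_mts: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s (decimal gigabytes)."""
    return transfer_rate_mts * 1e6 * (bus_width_bits / 8) / 1e9

print(peak_bandwidth_gb_s(3200, 128))  # ~51.2 GB/s, in the ballpark of the quoted 49GB/s
print(peak_bandwidth_gb_s(4266, 128))  # ~68.3 GB/s, matching the ~68GB/s figure
```

So hitting ~68GB/s out of a 128-bit interface only takes a ~4266 MT/s speed grade, which lines up with the LPDDR4X part identified later in the thread.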
 
Doesn't this decision only help the M1's performance? Getting more RAM physically closer to the CPU and GPU is going to be one of the main ways to improve performance going forward.
It doesn't have to be on the SoC package, especially for what looks to be a 128-bit interface; memory is often put on the package to achieve a memory interface that includes the word "thousand". My uneducated guess is that it's a holdover from mobile, for now. They're using low-power DDR4, so there's nothing exotic about that either. Maybe they could have gone for GDDR, but that might have affected CPU performance (it's probably not available in modules that large either).

Here's the M1 CPU/GPU package with the RAM on the right (trace the raised green PCB outline around the memory and heatspreader). Looks wonky, right?
[Image: Apple-M1-Chip.jpg]

MacBook board (according to Google image search).
[Image: 2020-11-20-08-11-01-2.jpg]
 
lol those dumb motherfuckers. I just knew they'd screw this up and score an own goal. The very notion of trying to control this in software was just retarded in the first place. They've got the same kind of software limiter in the drivers to prevent hardware video encoding of more than two streams simultaneously on their consumer-grade cards too (even though even the 10xx series hardware can easily support 4 or 5 1080p 60fps streams at once with no trouble).

That encoding limit can be removed with an eight-byte patch to the driver. I laughed my ass off when I discovered that. Fucking idiots. I bet this limit can be removed just as easily once people figure out what bits to toggle.
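For anyone wondering what an "eight-byte patch" actually looks like, here's a minimal, generic sketch of that kind of search-and-replace binary patching. The byte patterns and file name are made-up placeholders for illustration, not the actual NVIDIA driver bytes or paths:

```python
# Generic binary patcher: find one exact byte sequence and swap it for another.
# OLD/NEW below are hypothetical placeholders, NOT real driver bytes or offsets.
from pathlib import Path

OLD = bytes.fromhex("85c00f84c0000000")  # hypothetical: "test result, jump if limit reached"
NEW = bytes.fromhex("85c0e9c100000090")  # hypothetical: unconditional jump past the check

def patch(path: str, old: bytes, new: bytes) -> None:
    data = Path(path).read_bytes()
    if data.count(old) != 1:
        raise ValueError("pattern not found exactly once; refusing to patch")
    Path(path).write_bytes(data.replace(old, new))

# Always work on a copy of the driver file, e.g.:
# patch("copy-of-the-encoder-library.dll", OLD, NEW)
```

Same-length old/new patterns keep the file size unchanged, which is why these tiny toggle patches tend to survive minor driver updates until the surrounding code moves.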
 
It doesn't have to be on the SoC package, especially for what looks to be a 128-bit interface; memory is often put on the package to achieve a memory interface that includes the word "thousand". My uneducated guess is that it's a holdover from mobile, for now. They're using low-power DDR4, so there's nothing exotic about that either. Maybe they could have gone for GDDR, but that might have affected CPU performance (it's probably not available in modules that large either).

Here's the M1 CPU/GPU package with the RAM on the right (trace the raised green PCB outline around the memory and heatspreader). Looks wonky, right?

MacBook board (according to Google image search).
Hah. The funny thing about that photo of the board is that while it does look a little 'odd' that they didn't put a matching heat spreader over the RAM modules, you can tell it's set up to look as 'balanced' as possible in how it faces whatever chips are sitting above it.

I would love to see a really sound engineering analysis of this particular design choice. It wasn't so long ago that we had Pentium Pros running sub-MB amounts of on-package L2 cache at full core speed (admittedly, only a couple hundred MHz) or P2s with 256-512 KB of on-package L2 at half speed. Obviously this memory isn't operating at that sort of multiplier. So where is the performance coming from? Are typical northbridges really that much of a hindrance to performance?
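No real answer to that, but here's the kind of toy measurement that hints at where the win is: summing the same data as a prefetch-friendly stream versus in a random order. It needs numpy, the absolute numbers are entirely machine-dependent, and it says nothing about the M1 specifically; it's only meant to show how brutal memory latency is compared to raw bandwidth, which is the gap that pulling memory closer (on-die controllers, now on-package DRAM) keeps chipping away at:

```python
# Toy demo: streaming through memory vs. hitting it in a random order.
# Requires numpy; results vary wildly by machine -- this is just a feel for
# latency-bound vs. bandwidth-bound access, not an M1 benchmark.
import time
import numpy as np

N = 20_000_000
data = np.ones(N, dtype=np.int64)
shuffled_idx = np.random.permutation(N)

t0 = time.perf_counter()
seq = data.sum()                    # sequential: prefetchers and wide buses shine
t1 = time.perf_counter()
rnd = data[shuffled_idx].sum()      # random gather: dominated by DRAM latency
t2 = time.perf_counter()

print(f"sequential sum: {t1 - t0:.3f}s, random-order sum: {t2 - t1:.3f}s")
```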
 
lol those dumb motherfuckers. I just knew they'd screw this up and score an own goal. The very notion of trying to control this in software was just retarded in the first place. They've got the same kind of software limiter in the drivers to prevent hardware video encoding of more than two streams simultaneously on their consumer-grade cards too (even though even the 10xx series hardware can easily support 4 or 5 1080p 60fps streams at once with no trouble).

That encoding limit can be removed with an eight-byte patch to the driver. I laughed my ass off when I discovered that. Fucking idiots. I bet this limit can be removed just as easily once people figure out what bits to toggle.

This story came out more recently:

Crypto Miners Fool Nvidia's Anti-Mining Limiter With $6 HDMI Dummy Plug

As odd as it may sound, Nvidia gave away the keys to its own kingdom when the chipmaker accidentally released a GeForce beta driver that disabled the limiter.

However, the beta driver doesn't completely unlock Ethereum mining as there are still some restrictions present. For starters, the driver supposedly limits the mining activities to one GeForce RTX 3060. It does this by requiring the graphics card to communicate with the motherboard through a PCIe 3.0 x8 interface at a minimum, meaning PCIe x1 risers are useless. Furthermore, a monitor has to be connected to the GeForce RTX 3060 via the HDMI port or DisplayPort output.

Nvidia's conditions aren't as demanding as they may sound. The PCIe 3.0 x8 requirement only means that you'll need to pick up a motherboard that has sufficient PCIe 3.0 x8 slots to house the number of GeForce RTX 3060 cards that you plan to stick in it. The second requisite seems expensive since you'd need to connect a monitor to each GeForce RTX 3060. However, Nvidia's driver isn't as smart as the chipmaker makes it out to be: the driver detects if a monitor is connected to the graphics card, but it can't tell whether it's a real display or not. Therefore, an HDMI dummy plug, which retails for as low as $5.99 on Amazon, easily tricks Nvidia's driver into thinking that a display is present when in reality, it isn't.
 
Hah. The funny thing about that photo of the board is that while it does look a little 'odd' that they didn't put a matching heat spreader over the RAM modules, you can tell it's set up to look as 'balanced' as possible in how it faces whatever chips are sitting above it.

I would love to see a really sound engineering analysis of this particular design choice. It wasn't so long ago that we had Pentium Pros running sub-MB amounts of on-package L2 cache at full core speed (admittedly, only a couple hundred MHz) or P2s with 256-512 KB of on-package L2 at half speed. Obviously this memory isn't operating at that sort of multiplier. So where is the performance coming from? Are typical northbridges really that much of a hindrance to performance?
Yeah, I really don't know how they get 68GB/s out of that, I can't imagine that they run it at insane speeds, and that's why I used a Ryzen with DDR4 3200 as a comparison. I'm too lazy to do the math but I would guesstimate it to be the equivalent of DDR4 4200 numbers, which is pretty insane (edit: for fuck's sake, I looked up the chip and package size and saw that it uses LPDDR4X-4266). Looking up the modules themselves, they're SK hynix. Can't find ANY information on them, and the shape and size is a bit funky; it could be a custom package housing more dies than we can see.

It's the on-package mounting that confuses me; it feels like putting a spoiler on a golf cart. There's not a lot of room on the board, but on everything before this they managed to squeeze in a more conventional solution. I don't think they're so cheap that they'd be afraid of adding another layer to the board if they absolutely had to.
 
lol those dumb motherfuckers. I just knew they'd screw this up and score an own goal. The very notion of trying to control this in software was just retarded in the first place. They've got the same kind of software limiter in the drivers to prevent hardware video encoding of more than two streams simultaneously on their consumer-grade cards too (even though even the 10xx series hardware can easily support 4 or 5 1080p 60fps streams at once with no trouble).
Why are they putting arbitrary limits on their own hardware?
 
Quadro. They gimp consumer cards in several ways.
So their enterprise chips are the same shit as their consumer chips, and they're hobbling consumer chips because they want professionals buying the enterprise?
 
So their enterprise chips are the same shit as their consumer chips, and they're hobbling consumer chips because they want professionals buying the enterprise?
Yep. (edit: well, not exactly, really expensive Quadros have some important differences) It wasn't like that in the beginning and I've sperged about it before. Back when the GeForce 256 was released you could buy one for $299 and have a fucking stellar card for Maya, the recently released successor to PowerAnimator and the next generation of 3D software, available on Windows unlike PowerAnimator! XSI, the competition so to speak, was also released around that time. You could build a great 3D workstation for next to nothing (compared to conventional pricing), and the PC graphics evolution was part of the reason Silicon Graphics rapidly declined: they weren't needed anymore. It was a brave new world. Wild times.

Then they hobbled the GF256 drivers and, like the 3060 drivers, that could be easily fixed.

Then with the GeForce 2 they put in special hardware (a resistor, IIRC) to separate the consumer and professional line (Quadro), and that could be easily fixed.

Right now you can buy a used GT 1030 equivalent with Quadro branding for $100 that lets you run four 5K monitors at once via Mini DisplayPort. They allow 10-bit color on consumer cards now; that's an artificial restriction they lifted when HDR arrived.
 
I'd say assume the M1 graphics land at or above GeForce GTX 1050 level for the 7-core/8-core GPU models with 8 GB. For the 8-core GPU with 16 GB it's harder to tell, but I'd say between a 1060 6 GB and a 1070. We'll probably have to wait for Baldur's Gate 3 and Metro Exodus to come out to be sure.

It's quite the interesting chip, even though the general consensus is "Apple did it... but we're not sure how."
That being said, for gaming Mac isn't the best platform, unless you like indie stuff.

There are rumours, quite a few, going around about an M1X, which is supposed to be for a 14-inch MBP, top-end Mini, and 21- and 24-inch iMacs, though, and if the rumours are true, it's supposed to be packing 12 cores, up from the M1's 8, and giving a 40-50% boost.

If that thing launches and graphics card prices are still ludicrous, it could be a very interesting dilemma.
 
Always remember that Apple can and will deem hardware like this "obsolete" a few years down the road: you'll stop getting updates and they might even sabotage how well the hardware runs and how useful it is. Since this thing will forever stay incredibly locked down, you won't even be able to install anything else on it and will forever be at the mercy of Apple for everything. Don't be hopeful about Linux "fixing" this; it might eventually, some day, maybe, boot a kernel, but I can guarantee you that the GPU will never be supported. If you think Apple might change their minds about maybe releasing specs to kernel developers to make it compatible once this thing is old some day, let me remind you that they never even did that for their 68k Macs from the 90s.

The SoC is cool but IMHO, that alone makes it a non-starter.
 
Always remember that Apple can and will deem hardware like this "obsolete" a few years down the road: you'll stop getting updates and they might even sabotage how well the hardware runs and how useful it is. Since this thing will forever stay incredibly locked down, you won't even be able to install anything else on it and will forever be at the mercy of Apple for everything. Don't be hopeful about Linux "fixing" this; it might eventually, some day, maybe, boot a kernel, but I can guarantee you that the GPU will never be supported. If you think Apple might change their minds about maybe releasing specs to kernel developers to make it compatible once this thing is old some day, let me remind you that they never even did that for their 68k Macs from the 90s.

The SoC is cool but IMHO, that alone makes it a non-starter.

Also, daily reminder that Apple deliberately released a software update for iPhones that checked to see whether they had been third-party repaired and then bricked them. It's not inconceivable that they will try that again. After all, "Error 53" hurt their sales by... absolutely nothing.
 
Always remember that Apple can and will deem hardware like this "obsolete" a few years down the road: you'll stop getting updates and they might even sabotage how well the hardware runs and how useful it is. Since this thing will forever stay incredibly locked down, you won't even be able to install anything else on it and will forever be at the mercy of Apple for everything. Don't be hopeful about Linux "fixing" this; it might eventually, some day, maybe, boot a kernel, but I can guarantee you that the GPU will never be supported. If you think Apple might change their minds about maybe releasing specs to kernel developers to make it compatible once this thing is old some day, let me remind you that they never even did that for their 68k Macs from the 90s.

The SoC is cool but IMHO, that alone makes it a non-starter.
Yup. Even if Apple had a lucid enough moment to actually innovate something and do a good job of it, it's still Apple. Walled garden, locked down hard, closed-source and proprietary software, no external documentation or access to knowledge about internals, restricted development ecosystem and toolchains, and often very active and downright hostile actions meant to thwart curious third parties from digging too deep and/or "breaking the hardware free" of its various chains.

Fuck them and their tech, even if it's occasionally nifty.
 
Yup. Even if Apple had a lucid enough moment to actually innovate something and do a good job of it, it's still Apple. Walled garden, locked down hard, closed-source and proprietary software, no external documentation or access to knowledge about internals, restricted development ecosystem and toolchains, and often very active and downright hostile actions meant to thwart curious third parties from digging too deep and/or "breaking the hardware free" of its various chains.

Fuck them and their tech, even if it's occasionally nifty.
"But dood! Didn't you see the benchmark performance of the Apple created hardware running inside of the Apple controlled environment!"

It better be damn fast for everything that you lose.
 
"But dood! Didn't you see the benchmark performance of the Apple created hardware running inside of the Apple controlled environment!"

It better be damn fast for everything that you lose.

Well this is it. Apple came out with "M1 is faster than 98 percent of all other laptops!" but failed to mention that they only used Geekbench for that and cherry-picked the use cases. It's also probable that they did something equivalent to what Microsoft did with the AARD code back in the Windows 3.1 days.

The AARD code was an obfuscated check that Microsoft put in the Windows 3.1 beta to flag systems not running Windows atop genuine MS-DOS. At load time it checked whether it was running on a DOS workalike like DR-DOS, and if it was, it threw a vague, scary error message implying the setup wasn't compatible, even though everything actually worked fine. Then people would ring the tech support line and ask, "hey, what's wrong with Windows?", and when they said they were using DR-DOS, they would be given a load of technobabble about how DR-DOS wasn't totally compatible with Windows and how, if they only used MS-DOS instead, they wouldn't have this problem. It was one of the things that got Microsoft hit with antitrust and DR-DOS lawsuits in the late 1990s.
 
Well this is it. Apple came out with "M1 is faster than 98 percent of all other laptops!" but failed to mention that they only used Geekbench for that and cherry-picked the use cases. It's also probable that they did something equivalent to what Microsoft did with the AARD code back in the Windows 3.1 days.
I read one interesting article a while back that had a big thunk moment. The M1 runs one thread per core, and in single-core benchmarks against Intel/AMD the x86 chips also only get to run 1 core/1 thread, even though there are legit performance reasons for letting SMT-enabled CPUs run both threads of a core for tasks that aren't explicitly multithreaded; that's why SMT exists. The difference between a Core i5 and a Core i7 used to be that one had twice as many logical processors but the same number of cores, and that made it run Counter-Strike or some other ancient single-threaded monolith faster.

[in Cinebench] We saw between 20% to 30% improvement in "single-core" results while allowing x86 SMT-based processors to utilize the second thread associated with the same core. For those interested, Geekbench also saw an average of 20-25% improvement with the same technique.
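If you want to reproduce that effect yourself, here's a rough Linux-only sketch (not the article's actual methodology): time the same busy work with two workers pinned to a single logical CPU, then to two logical CPUs. Treating CPUs 0 and 1 as SMT siblings is an assumption; check /sys/devices/system/cpu/cpu0/topology/thread_siblings_list on your box first.

```python
# Compare "1 core, 1 thread" vs. "1 core, 2 SMT threads" for the same workload.
# Linux-only (os.sched_setaffinity); assumes logical CPUs 0 and 1 share a core.
import os
import time
from multiprocessing import Pool

def spin(_):
    # Arbitrary integer-heavy busy work standing in for a benchmark kernel.
    acc = 0
    for i in range(20_000_000):
        acc += (i * i) % 7
    return acc

def run_pinned(cpus):
    os.sched_setaffinity(0, cpus)   # workers forked by Pool inherit this affinity
    start = time.perf_counter()
    with Pool(2) as pool:           # two workers competing for the allowed CPUs
        pool.map(spin, range(2))
    return time.perf_counter() - start

if __name__ == "__main__":
    print("one logical CPU of the core:", run_pinned({0}))
    print("both SMT siblings          :", run_pinned({0, 1}))
```

On an SMT machine you'd expect the second run to finish noticeably faster, which is roughly the 20-30% gap the article is talking about.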
 
I read one interesting article a while back that had a big thunk moment. The M1 runs one thread per core, and in single-core benchmarks against Intel/AMD the x86 chips also only get to run 1 core/1 thread, even though there are legit performance reasons for letting SMT-enabled CPUs run both threads of a core for tasks that aren't explicitly multithreaded; that's why SMT exists. The difference between a Core i5 and a Core i7 used to be that one had twice as many logical processors but the same number of cores, and that made it run Counter-Strike or some other ancient single-threaded monolith faster.



And as expected, the comments are full of word salad copium from Apple fanboys.

The thing is, let's be honest, most MacBook Air M1 users are going to be sneering hipster cunts who pose with it in coffee shops and occasionally do some light photoshopping.
 