GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

I'm looking at broadly similar mini PCs like the Asus PN50, which will come in at $150 or so less at best, and that's before you spend the extra money to add the RAM and storage that, you know, come with the Mac mini.

The M1 is faster in single and multi-threaded CPU tasks. And it destroys AMD and Intel integrated GPU options.

And... and I know this is an alien notion to people who want to buy a fast Ryzen or any sort of modern GPU... it is available to buy. At RRP.

EDIT: I will be the first in line to throw Tim Cook off a tall building, but Apple has some good engineers, and they absolutely killed it on the M1. I am really interested to see how they do in regards to allowing for off-package memory (I bet they can stretch to 32GB from 16GB in a bigger footprint, but who knows what that would do to yields- although the CPU isn't actually on the same die as the memory in the M1, just soldered onto the same package) and external GPUs in the successors. I think they probably need to do both to allow for certain AV-related tasks; the only question is what it will do to memory access speed, which is obviously the source of a lot of their really stellar numbers on the M1.
They reportedly get 68GB/s out of that memory compared to 49GB/s for a Ryzen using DDR4 3200, but the M1 has to share that with the GPU. Putting the memory on the package is weird; maybe that's just temporary and inherited from the phone designs. They might go down the route of graphics cards and consoles and put it on the board instead while still having a unified memory architecture with massive memory bandwidth. Or maybe they're thinking about HBM down the line if prices and availability start to make sense.
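For anyone who wants to sanity-check those numbers, here's a quick back-of-the-envelope sketch. The 128-bit bus widths are my assumption (dual-channel DDR4 on the Ryzen side, and the M1's LPDDR4X interface), not figures quoted above:

```python
# Peak theoretical DRAM bandwidth = transfer rate (MT/s) x bus width (bytes).
# Assumed configs: dual-channel DDR4-3200 (2 x 64-bit) vs. LPDDR4X-4266 on a
# 128-bit interface -- the bus widths are assumptions, not quoted figures.
def peak_bandwidth_gb_s(transfer_rate_mts: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s (decimal gigabytes)."""
    return transfer_rate_mts * 1e6 * (bus_width_bits / 8) / 1e9

print(peak_bandwidth_gb_s(3200, 128))  # ~51.2 GB/s, in the ballpark of the quoted 49GB/s
print(peak_bandwidth_gb_s(4266, 128))  # ~68.3 GB/s, matching the ~68GB/s figure
```

So hitting ~68GB/s out of a 128-bit interface only takes a ~4266 MT/s speed grade, which lines up with the LPDDR4X part identified later in the thread.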
 
Doesn't this decision only help the M1's performance? Getting more RAM physically closer to the CPU and GPU is going to be one of the main ways to improve performance going forward.
It doesn't have to be on the SoC package, especially for what looks to be a 128-bit interface; memory is often put on the package to achieve a memory interface that includes the word "thousand". My uneducated guess is that it's a holdover from mobile, for now. They're using low-power DDR4, so there's nothing exotic about that either. Maybe they could have gone for GDDR, but that might have affected CPU performance (it's probably not available in modules that large either).

Here's the M1 CPU/GPU package with the RAM on the right (trace the raised green PCB outline around the memory and heatspreader). Looks wonky, right?
[Image: Apple-M1-Chip.jpg]

MacBook board (according to Google image search).
[Image: 2020-11-20-08-11-01-2.jpg]
 
lol those dumb motherfuckers. I just knew they'd screw this up and score an own goal. The very notion of trying to control this in software was just retarded in the first place. They've got the same kind of software limiter in the drivers to prevent hardware video encoding of more than two streams simultaneously on their consumer-grade cards too (even though even the 10xx series hardware can easily support 4 or 5 1080p 60fps streams at once with no trouble).

That encoding limit can be removed with an eight-byte patch to the driver. I laughed my ass off when I discovered that. Fucking idiots. I bet this limit can be removed just as easily once people figure out what bits to toggle.
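For anyone wondering what an "eight-byte patch" actually looks like, here's a minimal, generic sketch of that kind of search-and-replace binary patching. The byte patterns and file name are made-up placeholders for illustration, not the actual NVIDIA driver bytes or paths:

```python
# Generic binary patcher: find one exact byte sequence and swap it for another.
# OLD/NEW below are hypothetical placeholders, NOT real driver bytes or offsets.
from pathlib import Path

OLD = bytes.fromhex("85c00f84c0000000")  # hypothetical: "test result, jump if limit reached"
NEW = bytes.fromhex("85c0e9c100000090")  # hypothetical: unconditional jump past the check

def patch(path: str, old: bytes, new: bytes) -> None:
    data = Path(path).read_bytes()
    if data.count(old) != 1:
        raise ValueError("pattern not found exactly once; refusing to patch")
    Path(path).write_bytes(data.replace(old, new))

# Always work on a copy of the driver file, e.g.:
# patch("copy-of-the-encoder-library.dll", OLD, NEW)
```

Same-length old/new patterns keep the file size unchanged, which is why these tiny toggle patches tend to survive minor driver updates until the surrounding code moves.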
 
It doesn't have to be on the SoC package, especially for what looks to be a 128-bit interface; memory is often put on the package to achieve a memory interface that includes the word "thousand". My uneducated guess is that it's a holdover from mobile, for now. They're using low-power DDR4, so there's nothing exotic about that either. Maybe they could have gone for GDDR, but that might have affected CPU performance (it's probably not available in modules that large either).

Here's the M1 CPU/GPU package with the RAM on the right (trace the raised green PCB outline around the memory and heatspreader). Looks wonky, right?

MacBook board (according to Google image search).
Hah. The funny thing about that photo of the board is that while it does look a little 'odd' that they didn't put a matching heat spreader over the RAM modules, you can tell it's set up to look as 'balanced' as possible in how it faces whatever chips are sitting above it.

I would love to see a really sound engineering analysis of this particular design choice. It wasn't so long ago that we had Pentium Pros running sub-MB amounts of on-package L2 cache at full core speed (admittedly, only a couple hundred MHz) or P2s with 256-512 KB of on-package L2 at half speed. Obviously this memory isn't operating at that sort of multiplier. So where is the performance coming from? Are typical northbridges really that much of a hindrance to performance?
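No real answer to that, but here's the kind of toy measurement that hints at where the win is: summing the same data as a prefetch-friendly stream versus in a random order. It needs numpy, the absolute numbers are entirely machine-dependent, and it says nothing about the M1 specifically; it's only meant to show how brutal memory latency is compared to raw bandwidth, which is the gap that pulling memory closer (on-die controllers, now on-package DRAM) keeps chipping away at:

```python
# Toy demo: streaming through memory vs. hitting it in a random order.
# Requires numpy; results vary wildly by machine -- this is just a feel for
# latency-bound vs. bandwidth-bound access, not an M1 benchmark.
import time
import numpy as np

N = 20_000_000
data = np.ones(N, dtype=np.int64)
shuffled_idx = np.random.permutation(N)

t0 = time.perf_counter()
seq = data.sum()                    # sequential: prefetchers and wide buses shine
t1 = time.perf_counter()
rnd = data[shuffled_idx].sum()      # random gather: dominated by DRAM latency
t2 = time.perf_counter()

print(f"sequential sum: {t1 - t0:.3f}s, random-order sum: {t2 - t1:.3f}s")
```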
 
lol those dumb motherfuckers. I just knew they'd screw this up and score an own goal. The very notion of trying to control this in software was just retarded in the first place. They've got the same kind of software limiter in the drivers to prevent hardware video encoding of more than two streams simultaneously on their consumer-grade cards too (even though even the 10xx series hardware can easily support 4 or 5 1080p 60fps streams at once with no trouble).

That encoding limit can be removed with an eight-byte patch to the driver. I laughed my ass off when I discovered that. Fucking idiots. I bet this limit can be removed just as easily once people figure out what bits to toggle.

This story came out more recently:

Crypto Miners Fool Nvidia's Anti-Mining Limiter With $6 HDMI Dummy Plug

As odd as it may sound, Nvidia gave away the keys to its own kingdom when the chipmaker accidentally released a GeForce beta driver that disabled the limiter.

However, the beta driver doesn't completely unlock Ethereum mining as there are still some restrictions present. For starters, the driver supposedly limits the mining activities to one GeForce RTX 3060. It does this by requiring the graphics card to communicate with the motherboard through a PCIe 3.0 x8 interface at a minimum, meaning PCIe x1 risers are useless. Furthermore, a monitor has to be connected to the GeForce RTX 3060 via the HDMI port or DisplayPort output.

Nvidia's conditions aren't as demanding as they may sound. The PCIe 3.0 x8 requirement only means that you'll need to pick up a motherboard that has sufficient PCIe 3.0 x8 slots to house the number of GeForce RTX 3060 cards that you plan to stick in it. The second requisite seems expensive since you'd need to connect a monitor to each GeForce RTX 3060. However, Nvidia's driver isn't as smart as the chipmaker makes it out to be: the driver detects if a monitor is connected to the graphics card, but it can't tell whether it's a real display or not. Therefore, an HDMI dummy plug, which retails for as low as $5.99 on Amazon, easily tricks Nvidia's driver into thinking that a display is present when in reality, it isn't.
 
Hah. The funny thing about that photo of the board is that while it does look a little 'odd' that they didn't put a matching heat spreader over the RAM modules, you can tell it's set up to look as 'balanced' as possible in how it faces whatever chips are sitting above it.

I would love to see a really sound engineering analysis of this particular design choice. It wasn't so long ago that we had Pentium Pros running sub-MB amounts of on-package L2 cache at full core speed (admittedly, only a couple hundred MHz) or P2s with 256-512 KB of on-package L2 at half speed. Obviously this memory isn't operating at that sort of multiplier. So where is the performance coming from? Are typical northbridges really that much of a hindrance to performance?
Yeah, I really don't know how they get 68GB/s out of that, I can't imagine that they run it at insane speeds, and that's why I used a Ryzen with DDR4 3200 as a comparison. I'm too lazy to do the math but I would guesstimate it to be the equivalent of DDR4 4200 numbers, which is pretty insane (edit: for fuck's sake, I looked up the chip and package size and saw that it uses LPDDR4X-4266). Looking up the modules themselves, they're SK hynix. Can't find ANY information on them, and the shape and size is a bit funky; it could be a custom package housing more dies than we can see.

It's the on-package mounting that confuses me; it feels like putting a spoiler on a golf cart. There's not a lot of room on the board, but on everything before this they managed to squeeze in a more conventional solution. I don't think they're so cheap that they'd be afraid of adding another layer to the board if they absolutely had to.
 
lol those dumb motherfuckers. I just knew they'd screw this up and score an own goal. The very notion of trying to control this in software was just retarded in the first place. They've got the same kind of software limiter in the drivers to prevent hardware video encoding of more than two streams simultaneously on their consumer-grade cards too (even though even the 10xx series hardware can easily support 4 or 5 1080p 60fps streams at once with no trouble).
Why are they putting arbitrary limits on their own hardware?
 
Quadro. They gimp consumer cards in several ways.
So their enterprise chips are the same shit as their consumer chips, and they're hobbling consumer chips because they want professionals buying the enterprise?
 
So their enterprise chips are the same shit as their consumer chips, and they're hobbling consumer chips because they want professionals buying the enterprise?
Yep. (edit: well, not exactly, really expensive Quadros have some important differences) It wasn't like that in the beginning and I've sperged about it before. Back when the GeForce 256 was released you could buy one for $299 and have a fucking stellar card for Maya, the recently released successor to PowerAnimator and the next generation of 3D software, available on Windows unlike PowerAnimator! XSI, the competition so to speak, was also released around that time. You could build a great 3D workstation for next to nothing (compared to conventional pricing), and the PC graphics evolution was part of the reason Silicon Graphics rapidly declined: they weren't needed anymore. It was a brave new world. Wild times.

Then they hobbled the GF256 drivers and, like the 3060 drivers, that could be easily fixed.

Then with the GeForce 2 they put in special hardware (a resistor, IIRC) to separate the consumer and professional line (Quadro), and that could be easily fixed.

Right now you can buy a used GT 1030 equivalent with Quadro branding for $100 that lets you run four 5K monitors at once via Mini DisplayPort. They allow 10-bit color on consumer cards now; that's an artificial restriction they lifted when HDR arrived.
 
I'd say assume the M1 graphics land at or above GeForce GTX 1050 level for the 7-core/8-core GPU models with 8 GB. For the 8-core GPU with 16 GB it's harder to tell, but I'd say between a 1060 6 GB and a 1070. We'll probably have to wait for Baldur's Gate 3 and Metro Exodus to come out to be sure.

It's quite the interesting chip, even though the general consensus is "Apple did it... but we're not sure how."
That being said, for gaming Mac isn't the best platform, unless you like indie stuff.

There are rumours, quite a few, going around about an M1X, which is supposed to be for a 14-inch MBP, top-end Mini, and 21- and 24-inch iMacs, though, and if the rumours are true, it's supposed to be packing 12 cores, up from the M1's 8, and giving a 40-50% boost.

If that thing launches and graphics card prices are still ludicrous, it could be a very interesting dilemma.
 
Always remember that Apple can and will deem hardware like this "obsolete" a few years down the road: you'll stop getting updates and they might even sabotage how well the hardware runs and how useful it is. Since this thing will forever stay incredibly locked down, you won't even be able to install anything else on it and will forever be at the mercy of Apple for everything. Don't be hopeful about Linux "fixing" this; it might eventually, some day, maybe, boot a kernel, but I can guarantee you that the GPU will never be supported. If you think Apple might change their minds about maybe releasing specs to kernel developers to make it compatible once this thing is old some day, let me remind you that they never even did that for their 68k Macs from the 90s.

The SoC is cool but IMHO, that alone makes it a non-starter.
 
Always remember that Apple can and will deem hardware like this "obsolete" a few years down the road: you'll stop getting updates and they might even sabotage how well the hardware runs and how useful it is. Since this thing will forever stay incredibly locked down, you won't even be able to install anything else on it and will forever be at the mercy of Apple for everything. Don't be hopeful about Linux "fixing" this; it might eventually, some day, maybe, boot a kernel, but I can guarantee you that the GPU will never be supported. If you think Apple might change their minds about maybe releasing specs to kernel developers to make it compatible once this thing is old some day, let me remind you that they never even did that for their 68k Macs from the 90s.

The SoC is cool but IMHO, that alone makes it a non-starter.

Also, daily reminder that Apple deliberately released a software update for iPhones that checked to see whether they had been third-party repaired and then bricked them. It's not inconceivable that they will try that again. After all, "Error 53" hurt their sales by... absolutely nothing.
 
Always remember that Apple can and will deem hardware like this "obsolete" a few years down the road: you'll stop getting updates and they might even sabotage how well the hardware runs and how useful it is. Since this thing will forever stay incredibly locked down, you won't even be able to install anything else on it and will forever be at the mercy of Apple for everything. Don't be hopeful about Linux "fixing" this; it might eventually, some day, maybe, boot a kernel, but I can guarantee you that the GPU will never be supported. If you think Apple might change their minds about maybe releasing specs to kernel developers to make it compatible once this thing is old some day, let me remind you that they never even did that for their 68k Macs from the 90s.

The SoC is cool but IMHO, that alone makes it a non-starter.
Yup. Even if Apple had a lucid enough moment to actually innovate something and do a good job of it, it's still Apple. Walled garden, locked down hard, closed-source and proprietary software, no external documentation or access to knowledge about internals, restricted development ecosystem and toolchains, and often very active and downright hostile actions meant to thwart curious third parties from digging too deep and/or "breaking the hardware free" of its various chains.

Fuck them and their tech, even if it's occasionally nifty.
 
Yup. Even if Apple had a lucid enough moment to actually innovate something and do a good job of it, it's still Apple. Walled garden, locked down hard, closed-source and proprietary software, no external documentation or access to knowledge about internals, restricted development ecosystem and toolchains, and often very active and downright hostile actions meant to thwart curious third parties from digging too deep and/or "breaking the hardware free" of its various chains.

Fuck them and their tech, even if it's occasionally nifty.
"But dood! Didn't you see the benchmark performance of the Apple created hardware running inside of the Apple controlled environment!"

It better be damn fast for everything that you lose.
 
"But dood! Didn't you see the benchmark performance of the Apple created hardware running inside of the Apple controlled environment!"

It better be damn fast for everything that you lose.

Well this is it. Apple came out with "M1 is faster than 98 percent of all other laptops!" but failed to mention that they only used Geekbench for that and cherry-picked the use cases. It's also probable that they did something equivalent to what Microsoft did with the AARD code back in the Windows 3.1 days.

The AARD code was an obfuscated check that Microsoft put in the Windows 3.1 beta to flag systems not running Windows atop genuine MS-DOS. At load time it checked whether it was running on a DOS workalike like DR-DOS, and if it was, it threw a vague, scary error message implying the setup wasn't compatible, even though everything actually worked fine. Then people would ring the tech support line and ask, "hey, what's wrong with Windows?", and when they said they were using DR-DOS, they would be given a load of technobabble about how DR-DOS wasn't totally compatible with Windows and how, if they only used MS-DOS instead, they wouldn't have this problem. It was one of the things that got Microsoft hit with antitrust and DR-DOS lawsuits in the late 1990s.
 
Well this is it. Apple came out with "M1 is faster than 98 percent of all other laptops!" but failed to mention that they only used Geekbench for that and cherry-picked the use cases. It's also probable that they did something equivalent to what Microsoft did with the AARD code back in the Windows 3.1 days.
I read one interesting article a while back that had a big thunk moment. The M1 runs one thread per core, and in single-core benchmarks against Intel/AMD the x86 chips also only get to run 1 core/1 thread, even though there are legit performance reasons for letting SMT-enabled CPUs run both threads of a core for tasks that aren't explicitly multithreaded; that's why SMT exists. The difference between a Core i5 and a Core i7 used to be that one had twice as many logical processors but the same number of cores, and that made it run Counter-Strike or some other ancient single-threaded monolith faster.

[in Cinebench] We saw between 20% to 30% improvement in "single-core" results while allowing x86 SMT-based processors to utilize the second thread associated with the same core. For those interested, Geekbench also saw an average of 20-25% improvement with the same technique.
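If you want to reproduce that effect yourself, here's a rough Linux-only sketch (not the article's actual methodology): time the same busy work with two workers pinned to a single logical CPU, then to two logical CPUs. Treating CPUs 0 and 1 as SMT siblings is an assumption; check /sys/devices/system/cpu/cpu0/topology/thread_siblings_list on your box first.

```python
# Compare "1 core, 1 thread" vs. "1 core, 2 SMT threads" for the same workload.
# Linux-only (os.sched_setaffinity); assumes logical CPUs 0 and 1 share a core.
import os
import time
from multiprocessing import Pool

def spin(_):
    # Arbitrary integer-heavy busy work standing in for a benchmark kernel.
    acc = 0
    for i in range(20_000_000):
        acc += (i * i) % 7
    return acc

def run_pinned(cpus):
    os.sched_setaffinity(0, cpus)   # workers forked by Pool inherit this affinity
    start = time.perf_counter()
    with Pool(2) as pool:           # two workers competing for the allowed CPUs
        pool.map(spin, range(2))
    return time.perf_counter() - start

if __name__ == "__main__":
    print("one logical CPU of the core:", run_pinned({0}))
    print("both SMT siblings          :", run_pinned({0, 1}))
```

On an SMT machine you'd expect the second run to finish noticeably faster, which is roughly the 20-30% gap the article is talking about.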
 
I read one interesting article a while back that had a big thunk moment. The M1 runs one thread per core, and in single-core benchmarks against Intel/AMD the x86 chips also only get to run 1 core/1 thread, even though there are legit performance reasons for letting SMT-enabled CPUs run both threads of a core for tasks that aren't explicitly multithreaded; that's why SMT exists. The difference between a Core i5 and a Core i7 used to be that one had twice as many logical processors but the same number of cores, and that made it run Counter-Strike or some other ancient single-threaded monolith faster.



And as expected, the comments are full of word salad copium from Apple fanboys.

The thing is, let's be honest, most MacBook Air M1 users are going to be sneering hipster cunts who pose with it in coffee shops and occasionally do some light photoshopping.
 