GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

x²=2, solve for x.

Sure. 16 bits enough? You can use integers to do fixed-point arithmetic. This is how 1337 h4xx0rz did it in the good old days.



Program stdout
sqrt(2.302000) = 1.517234
Fixed point value is 2.301
Fixed point sqrt is 1.516
sqrt(2.010000) = 1.417745
Fixed point value is 2.007
Fixed point sqrt is 1.416

EDIT: There's a bug in the conversion from int format to printed decimal and vice-versa if the first decimal is zero. I don't care to fix it.
EDIT 2: I lied, I do care.
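Something like this is the general idea, as a minimal sketch (the Q6.10 format, helper names, and iteration count are illustrative; the code in the screenshot may differ):

Code:
#include <stdio.h>
#include <stdint.h>
#include <math.h>

/* Q6.10 fixed point: 16-bit signed value, 10 fractional bits. */
#define FRAC_BITS 10
#define ONE (1 << FRAC_BITS)

typedef int16_t fix16;

static fix16 fix_from_double(double x) { return (fix16)(x * ONE + 0.5); }
static double fix_to_double(fix16 x) { return (double)x / ONE; }

/* Integer Newton's method: iterate r = (r + x/r) / 2 in fixed point. */
static fix16 fix_sqrt(fix16 x)
{
    if (x <= 0) return 0;
    int32_t r = x; /* initial guess */
    for (int i = 0; i < 16; i++) {
        /* x/r in fixed point is (x << FRAC_BITS) / r */
        int32_t q = ((int32_t)x << FRAC_BITS) / r;
        r = (r + q) >> 1;
    }
    return (fix16)r;
}

int main(void)
{
    double v = 2.302;
    fix16 f = fix_from_double(v);
    printf("sqrt(%f) = %f\n", v, sqrt(v));
    printf("Fixed point value is %.3f\n", fix_to_double(f));
    printf("Fixed point sqrt is %.3f\n", fix_to_double(fix_sqrt(f)));
    return 0;
}

Everything in fix_sqrt stays in integer registers; the doubles only exist at the I/O edges, which is the whole point on an FPU-less chip.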
 
Modern processors are huge, complicated black boxes, and unintended side effects are pretty much the name of the game. And that's just the accidental stuff. I don't even wanna know what series of innocent-looking instructions can give you ring 0 on your average consumer CPU nowadays.

The first CPU I had that could do branch prediction was from Cyrix, and you had to enable it by poking a register, as it was included as an experimental feature. Enabling it gave quite the speed boost but would be somewhat unstable with everything 32-bit. This was a CPU that'd go to 90 percent utilization from playing an MP3, mind you. That was a time when I felt computers were still safe, because they lacked any pretense of being so.

Now it's layers upon layers of obfuscation and encryption and embedded "security" processors and memory protection, and even hardware in the same system has safeguards about not trusting the other hardware, and every few months, like clockwork, the whole shitheap is bypassed by one implementation bug or other anyway. Personally, I think it's the pretense that it is safe that irks me, instead of the simple truth that it isn't. When a program on my Amiga goes down, it has the capability to take everything with it, because there's no memory protection and every program can access the whole memory space of any other program. And you know, that feels honest, in a way. (Makes you weed out bad software really quickly, too.)

But jokes aside, most of these problems are problems because of the interconnectedness of everything and the inherent opportunistic nature of man. Who knew running arbitrary code from random servers all over the world could cause security issues for the end user? Shocking. When AI is good enough I will become a digital hermit on some very primitive computers and will only indirectly interact with everything online via AI. Kinda like how Stallman uses the internet, just more sci-fi. Hell is other people.
 
I had heard the main issue is that a full amd64 implementation tends to take up more die area and is a disadvantage for that reason; the article doesn't seem to go into that one
I don't see that being necessary except in weird cases where every micron of space is important
 
I had heard the main issue is that a full amd64 implementation tends to take up more die area and is a disadvantage for that reason; the article doesn't seem to go into that one
If we want to talk die area, I want to see Cortex-X4/X5 compared with Zen 4c/5c. The Zen 4c core shrinks 35% yet retains its full capabilities, including AVX-512 and 2 threads per core, drops clock speeds to the ballpark of where the fastest Cortex cores would be clocked in smartphones, and is paired with less L3 cache (ARM designs support up to 32 MB of L3 per cluster, but they tend to aim low in mobile).
 
I don't see that being necessary except in weird cases where every micron of space is important
Well, it always is. The fabs (more or less) charge by the wafer, so the more chips that fit on a wafer, the better off you are.
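To put rough numbers on that, here's a quick sketch using the standard dies-per-wafer approximation (the wafer size and die areas below are just illustrative):

Code:
#include <stdio.h>
#include <math.h>

/* Classic dies-per-wafer approximation:
   dies ~= pi * (d/2)^2 / A  -  pi * d / sqrt(2 * A)
   where d = wafer diameter in mm and A = die area in mm^2.
   The second term approximates the partial dies lost at the wafer edge. */
static double dies_per_wafer(double d_mm, double area_mm2)
{
    const double PI = 3.14159265358979;
    double r = d_mm / 2.0;
    return PI * r * r / area_mm2 - PI * d_mm / sqrt(2.0 * area_mm2);
}

int main(void)
{
    /* Illustrative: a 300 mm wafer with 100 mm^2 dies vs 70 mm^2 dies. */
    printf("100 mm^2: ~%.0f dies per wafer\n", dies_per_wafer(300.0, 100.0));
    printf(" 70 mm^2: ~%.0f dies per wafer\n", dies_per_wafer(300.0, 70.0));
    return 0;
}

A 30% shrink gets you roughly 45% more dies per wafer here, because smaller dies also waste less area at the wafer edge.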

If we want to talk die area, I want to see Cortex-X4/X5 compared with Zen 4c/5c. The Zen 4c core shrinks 35% yet retains its full capabilities, including AVX-512 and 2 threads per core, drops clock speeds to the ballpark of where the fastest Cortex cores would be clocked in smartphones, and is paired with less L3 cache (ARM designs support up to 32 MB of L3 per cluster, but they tend to aim low in mobile).
Fair, TBH. I had mostly heard it come up in relation to Apple's M chips, as one of the main reasons they were getting decent performance for their price point despite being objectively bad at that as a company.
 
I had heard the main issue is that a full amd64 implementation tends to take up more die area and is a disadvantage for that reason; the article doesn't seem to go into that one

Very little of the die area is taken up by the decoder.

Fair, TBH. I had mostly heard it come up in relation to Apple's M chips, as one of the main reasons they were getting decent performance for their price point despite being objectively bad at that as a company.

More energy is spent on moving data than on computing with it, so having everything in the same package (GPU, CPU, RAM) brings energy consumption down a lot. Then they use LPDDR5 rather than more energy-hungry DDR5. On top of that, energy per operation varies roughly with the square of clock speed, and Apple keeps the clocks down on their chips.
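The first-order model behind that last point, for the curious (a rule of thumb that ignores static/leakage power):

P_dynamic ≈ C · V² · f

Voltage has to rise roughly with frequency, so energy per operation (≈ C · V²) scales roughly with f². Halve the clock and each operation costs about a quarter of the energy, which is why wide-and-slow designs win on efficiency.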
 
Low-end Meteor Lake has appeared. Behold, the Core Ultra 5 115U with 2 P-cores, 4 E-cores, and 2 LP E-cores:


Based on the Wikipedia page I guess it was first noticed around March 7, but nobody cared.

It's reminiscent of parts like the dual-core i3-1115G4, and the nearly identical 2+4 cores: i3-1215U, i3-1315U, and Core 3 100U. Except Intel was unwilling to call it "Core Ultra 3". It has 3 of 4 Xe cores (48 EUs), which is less than the 64 EUs the 1215U+ had but should still be faster. Also 10 MB of L3 cache, same as the 1215U/1315U/100U.

One could argue that Intel is selling 8 "cores" where it was once selling a dual-core, but Raptor Lake Refresh (100U) is being sold concurrently and should be cheaper.
 
Low-end Meteor Lake has appeared. Behold, the Core Ultra 5 115U with 2 P-cores, 4 E-cores, and 2 LP E-cores:


Based on the Wikipedia page I guess it was first noticed around March 7, but nobody cared.

It's reminiscent of parts like the dual-core i3-1115G4, and the nearly identical 2+4 cores: i3-1215U, i3-1315U, and Core 3 100U. Except Intel was unwilling to call it "Core Ultra 3". It has 3 of 4 Xe cores (48 EUs), which is less than the 64 EUs the 1215U+ had but should still be faster. Also 10 MB of L3 cache, same as the 1215U/1315U/100U.

One could argue that Intel is selling 8 "cores" where it was once selling a dual-core, but Raptor Lake Refresh (100U) is being sold concurrently and should be cheaper.
What is that even for? Laptops that have different cores enabled for battery and plugged-in modes?
 
What is that even for? Laptops that have different cores enabled for battery and plugged-in modes?
Proof of concept? It wouldn't be the first time Intel sold a stupid product that should have stayed in engineering. Assuming the different types of cores are all on different tiles, it makes sense.
 
What is that even for? Laptops that have different cores enabled for battery and plugged-in modes?
The LP E-cores are on a different "tile" (chiplet) than the main CPU cores. I believe Intel's scheme is to turn off the CPU tile entirely when workloads only require the performance of the SoC tile, for mundane tasks like playing a YouTube video. The video decode/encode and display engines have also been separated from graphics and put onto the SoC tile. It's more relevant on battery than anything, but there has been a leak showing LP E-cores coming to desktop CPUs in the future.

The path forward for LP E-cores is confusing. Lunar Lake appears not to have them, despite targeting similar ultra-low-power applications to the low-TDP Meteor Lake-U models, with on-package memory. Arrow Lake desktop CPUs might not have them, but an MLID leak from October showed 2-4 of them in every upcoming desktop product. The benefits of LP E-cores on the desktop are questionable, but they could result in very low idle power usage, which will look great for ESG. ESG cores?

That sounds like a total mess for scheduling. I thought the E-cores were Skylake-ish clock for clock, which raises the question: what is performance like for the LP E-cores? Ivy Bridge? Haswell? Lynnfield?
I hope they know what they're doing, and I can't wait to read a Chips and Cheese article on it. Actually, there is one, but we'll probably see more testing later:
[embedded link to the Chips and Cheese article]

Intel claimed a +4% IPC gain for Crestmont over Gracemont, and both the E-cores and the LP E-cores use it. LP E-cores have low base clocks, ranging from 0.4 to 1 GHz in Meteor Lake, which is not too dissimilar from the other cores (e.g. the Core Ultra 5 125H is 1.2 GHz base for P-cores, 0.7 GHz for both E-cores and LP E-cores). But they have a max turbo of 2.1 GHz in Meteor Lake-U, or 2.5 GHz in Meteor Lake-H. So you could probably still compare it to Skylake, just clocked really low and without hyperthreading, such as the i5-6400T or Celeron G3900T. And it will be different due to other quirky behavior, such as no L3 cache access.

IIRC, benchmarks (unrealistic stress tests) will use every core available including LP E-cores, so they are not somehow inaccessible to the user as was previously speculated. But the primary purpose of the LP E-cores is to run while other cores aren't running, and punt tasks up the chain to the E-cores when the going gets tough.

I think AMD's conservative approach to "E-cores" will work very well with schedulers. Technically it has even landed on the desktop, if you count the Phoenix 2-based 8500G and 8300G desktop APUs, but it remains to be seen if they will pursue it for real with 16-core Zen 5c chiplets.
 
Intel's new scheduling approach to E-cores is the reverse of what they did with 12th-14th gen, where processes start on P-cores and only get demoted if they turn out not to be demanding (or the P-cores fill up).

What they're doing with LP E-cores makes sense when you take into account that their biggest threat in laptops is Apple, not AMD. Intel is close to process parity with TSMC, and AMD's genuine design advantage is in servers, not desktops. Perhaps this is because so many AMD people are ex-IBMers; Power was king of supercomputing BITD (Power10 is a beast as well, just nobody seems to buy it).

Apple has normalized the all-day laptop, and x86 machines with 3-hour battery life can't compete.
 
Why must prices for GPUs be so shit in this day and age? I keep putting off plans to build anything because I keep balking at the price of everything. Maybe it's just the inflation catching up to me, despite getting paid more.
If you get paid 2% more while inflation raised euro prices by 10%, you’re actually getting paid significantly less.
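Back-of-the-envelope: your real purchasing power changes by 1.02 / 1.10 ≈ 0.927, so a 2% raise against 10% inflation is really a ~7% pay cut.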
 