I understand what you're saying, but I think you're mistaken. Some of it has been known, sure, but definitely not all. The type of data they're collecting has never been collected before. Like I said, they didn't have the devices back then that they do now. They didn't have the data annotation software. They didn't even know which metrics to record. Comparing a psychometric test to what they're collecting today is like comparing a Lego set to a jet engine. It doesn't even come close.
We are talking about millions of humans measured in real time on dozens of metrics, both biological and cognitive, while being presented with custom-tailored tasks that themselves have near-infinite variety. They could divide the applicants into test groups based on any number of attributes, like location, age, languages spoken, or race, and present different tasks to different groups at different times of day... the data points are nearly endless.
EDIT: a key point of clarification: in the past, they were able to study and predict how humans reacted. This is what data harvesting through cellphones catered to. This new paradigm of data harvesting, however, is allowing them to model how humans actually think when given cognitive tasks. The "gig work" is just a ruse for that. Maybe they're training human-like agents with the data. Maybe they'll use it to identify individuals based on their thinking alone. Who knows, it's pretty far-reaching.