- Joined
- Jan 13, 2018
Follow along with the video below to see how to install our site as a web app on your home screen.
Note: This feature may not be available in some browsers.
But all I can find is scientific articles on load balancing that say "our method is 0.17% better than the method from that other article". I want something like this (this is the page that got me into programming, I had to win a demo of a text adventure and didn't feel like manually adding 5-digit edge costs in a 25x25 8-direction grid)."every queue has a task point cost, each task executed from a queue subtracts its cost from its point reserve and gives a point to each other queue".
Look into scheduling, this is similar to how operating systems decide which processes to run.<snip>
If you want something simple but effective, just use simple probability weighted to priority scores given to each job. If you pick a good ratio, you can balance things reasonably well by having the worker occasionally take lower value jobs based on some random number generated when it is looking for a new job.Is there a mathematical apparatus for this?
I feel like there should be something, on the level of "vector algebra", that I just don't know exists.
I have a (python + celery) worker program that processes tasks from a (rabbitMQ) queue. Some tasks come from more important, better-paying customers and therefore have "higher priority" than others. A naive "always do the highest-priority task" prioritization would see no tasks from smaller customers getting completed until the end of the day.
What I currently do is have several queues and more workers assigned to just the high-priority customers. This is bad, because if the high-priority queues are empty, these workers then sit idle and eat virtual machine money.
I have several "high-level" (aka stupid) ideas involving how all of this should work but all of them involve "human" thinking, like looking at the big picture and saying "well duh". But a worker doesn't have access to big picture "well duh" statistics and I'd rather not make a centralized load balancer.
There are probably simple algorithms that can make a worker decide which task to take, like
But all I can find is scientific articles on load balancing that say "our method is 0.17% better than the method from that other article". I want something like this (this is the page that got me into programming, I had to win a demo of a text adventure and didn't feel like manually adding 5-digit edge costs in a 25x25 8-direction grid).
where you may be picturing some form of quantization into weighted queues.I feel like there should be something, on the level of "vector algebra", that I just don't know exists.
I don't really understand the question here. You have come up with a method of assigning priorities to jobs. But then you find a reason why the priority system is producing undesirable results. By definition, this means that your low-priority jobs are higher priority than you were accounting for. I don't think you can ask someone else for an out-of-the-box solution to what sounds like it's going to depend very highly on parameters that only you can set weights to.I have a (python + celery) worker program that processes tasks from a (rabbitMQ) queue. Some tasks come from more important, better-paying customers and therefore have "higher priority" than others. A naive "always do the highest-priority task" prioritization would see no tasks from smaller customers getting completed until the end of the day.
What I currently do is have several queues and more workers assigned to just the high-priority customers. This is bad, because if the high-priority queues are empty, these workers then sit idle and eat virtual machine money.
Because this seems like a video game (or maybe some interview problem) rather than actual codeAlso... if you have multiple workers running queued jobs, why are they pulling from separate queues? If you just create one job queue, and it's sorted in order of priority, then you won't have workers sitting idle just because "their" queue emptied out. They'll just grab the highest-priority job in the queue, even if it's lower priority. Just make sure that two workers can't try to grab the same job at the same time.
Distributed systems of any form need some way to balance load.Because this seems like a video game (or maybe some interview problem) rather than actual code
I didn't, this is why I put quotation marks to indicate these are not real mathematical priorities but user-story, human-language priorities.I don't really understand the question here. You have come up with a method of assigning priorities to jobs.
Don't be autistic, this is a common natural-language constraint that humans have been dealing with with since before there were computers. You have an important client whose tasks need to be favored but not to the exclusion of everything else.I don't think you can ask someone else for an out-of-the-box solution to what sounds like it's going to depend very highly on parameters that only you can set weights to.
I'm new on this project. It uses celery with rabbitmq underneath (originally it was just rabbitmq and a custom class, I rewrote it to use celery because the boss told me to (he's a computer scientist). I don't have a table of tasks I can query, these are all celery tasks. I can't write a priority heap message broker. Rabbitmq doesn't have priority queues, what it calls priorities is extremely homosexual:Also... if you have multiple workers running queued jobs, why are they pulling from separate queues? If you just create one job queue, and it's sorted in order of priority, then you won't have workers sitting idle just because "their" queue emptied out. They'll just grab the highest-priority job in the queue, even if it's lower priority. Just make sure that two workers can't try to grab the same job at the same time.
If priority queues are what you want, this information previously stated values between 1 and 5 are highly recommended. If you must go higher than 5, values between 1 and 10 are sufficient (keep it to a single digit number) because currently using more priorities consumes more CPU resources by using more Erlang processes.
It's fucking rabbit, I don't have a table of tasks I can query. It would've been easy if I had a view of all the tasks and had to write a priority function on it, I've done it a lot (orders for human dispatchers, moderation queues for media censors).And finally, if the result (low-priority jobs never get done) is objectionable, why not just have one worker that will always pick the oldest job in the queue, instead of the job with the top priority. Or have all workers have some random chance that they'll pick the oldest job instead of the best paying job. Or, instead of having a fixed priority, have the priority of a job gradually increase the longer it's been waiting in the queue.
Peek the non priority queue and check if the top has been waiting too long and process it if it has, else let it go back into the queue and pull one off the priority queue.It's fucking rabbit, I don't have a table of tasks I can query. It would've been easy if I had a view of all the tasks and had to write a priority function on it, I've done it a lot (orders for human dispatchers, moderation queues for media censors).
What I was trying to get at is that it sounds like your code is working as designed, but when you're rating its performance you're using additional factors that the code itself was not designed to take into consideration. Your natural-language constraint is different from the constraint that's actually in your code, and that's your problem.Don't be autistic, this is a common natural-language constraint that humans have been dealing with with since before there were computers. You have an important client whose tasks need to be favored but not to the exclusion of everything else.
If you are asking about something like Dijkstra Algorithm, I think you are basically asking about Linear Programming. These are solved as "vector algebra" as you called them but you don't have to think of it that way for most solvers.Is there a mathematical apparatus for this?
I feel like there should be something, on the level of "vector algebra", that I just don't know exists.
Replace white house with DoD, gen z with gen x, and rust with Ada.
The similarity all the way down to extensively over-engineered 'memory safety' is indeed strikingReplace white house with DoD, gen z with gen x, and rust with Ada.
This doesn't need math, it just needs a tool that can have multiple workers taking care of jobs. Good thing you're using Celery!Some tasks come from more important, better-paying customers and therefore have "higher priority" than others. A naive "always do the highest-priority task" prioritization would see no tasks from smaller customers getting completed until the end of the day.
Frankly, it's almost exactly the same. It's been a long time since I've read it but I remember the discussion of scheduling in Operating Systems: Three Easy Pieces being very lucid (and the book overall) and there's a free version here (plus ... you know what to do):Look into scheduling, this is similar to how operating systems decide which processes to run.
Oh shiIf you want something simple but effective, just use simple probability weighted to priority scores given to each job. If you pick a good ratio, you can balance things reasonably well by having the worker occasionally take lower value jobs based on some random number generated when it is looking for a new job.
Without splitting hairs or getting bogged down in philosophy, either way you'll have to explain the priorities such that a computer can understand themI didn't, this is why I put quotation marks to indicate these are not real mathematical priorities but user-story, human-language priorities.
That reminds me of an interesting mechanic in the game Star Traders: Frontiers. So the simplified version is that for some missions there will be a card game where five cards are drawn and one of them will be the "Mission Success" card, initially rated at 20%. (The other four will be various risks, opportunities, gains, disasters.) Of course most of the time you will not draw "Mission Success" first. But then on the next round that card will be weighted to 22%, then in subsequent rounds 24%, 26%, etc. until drawing it becomes a near certainty over all of those rounds. (I think once I had to go to over 50%.) That might also be a way to handle low-priority jobs for OP.The naive "always do the highest-priority task" should work, if old tasks get increased in priority so that they won't get excluded entirely. Because that factor is important to you, but currently isn't important to your code.
Discussion of operating systems and genetic algorithms in this thread? Now we're really speaking my language lol.Frankly, it's almost exactly the same. It's been a long time since I've read it but I remember the discussion of scheduling in Operating Systems: Three Easy Pieces being very lucid (and the book overall) and there's a free version here (plus ... you know what to do):
https://pages.cs.wisc.edu/~remzi/OSTEP/
This may be a stretch but I'm also reminded of some of the literature from evolutionary computing. The following comes from An Introduction to Genetic Algorithms (Mitchell):
"A common selection method in GAs is fitness−proportionate selection, in which the number of times an individual is expected to reproduce is equal to its fitness divided by the average of fitnesses in the population. (This is equivalent to what biologists call 'viability selection.')
My thought with grouping into queues is that true fairness and equality in queuing systems are annoying in practice to the end users and affect incentives towards the extrema. So a floating window/range for each queue priority level based on average volume of job types would help ensure customers/users on the lower extremum of the priority spectrum have a sort of luck based incentive to roll the dice instead of waiting and saving up to pay more some time later. You could even dither a bit while sorting from the scores for even more randomness.So basically what I'm trying to say here is that if OP can figure out how to make the really important jobs occupy x% of the slots on a virtual roulette wheel, and the less important jobs the remainder (or even split into three or more categories), OP can just "spin the wheel" and the less important jobs will have their time sooner or later while the priorities are still respected on average.
lol one big takeaway from my time working in math and CS is that averages and the law of large numbers are more much powerful than many people think. I run into a lot of similar problems when doing signal processing work.Oh shi
I swear I just read this now
They're not mutually exclusive. My understanding is that early recurrent neural networks used an evolutionary computing approach because traditional backpropagation no worky and also that NEAT (Neuroevolution of Augmenting Topologies), where the entire topology of the network is permitted to evolve, is a popular approach to evolutionary computing in neural networks today.Discussion of operating systems and genetic algorithms in this thread? Now we're really speaking my language lol.
I always thought genetic programming seemed like a really cool counterpart to neural nets when it comes to iteratively self-improving over time. I used to frequently discuss such algorithms with a biology prof buddy I had over many lunch breaks.
How do I force a worker to consume a task from a particular queue at runtime with celery? If I could do this, it'd solve 90% of the problem (I'd still need to watch and tune priorities). I don't need working code, please just point me to the celery feature that does this.Have your high priority workers handle lower priority jobs if there's nothing for them to handle for a specific amount of time. Once a high priority task comes in again, handle those until you hit the timeout again.
Yeah, I was thinking about something like this, or buckets to emulate this without randomization.A simple method of implementing fitness−proportionate selection is 'roulette−wheel sampling'
Thank you! I'll start reading the book now.Frankly, it's almost exactly the same. It's been a long time since I've read it but I remember the discussion of scheduling in Operating Systems: Three Easy Pieces being very lucid (and the book overall) and there's a free version here (plus ... you know what to do):
I know! I can make a http microservice that explicitly ranks all queued tasks and responds with the next task to do whenever a worker hits it (I've done it a lot for humans). But I'd rather emulate the behavior with scores or probabilities, it's not a generic case where Anything Can Happen and tasks can be reranked every which way, a separate service looks like overkill.Without splitting hairs or getting bogged down in philosophy, either way you'll have to explain the priorities such that a computer can understand them
Learn C then Python.How does Dart stack up as a first language?