This might be a total noob question, but I'm currently trying to learn SQL/MySQL dialect and have run into a query I cannot quite figure out. I have two tables: "products" which has columns "product_id" (1 ,2, 3, ...) and "category" (A, B, C, ...) features and a table called "similarity" which has rows "product_id_1", "product_id_2", and "similarity". The latter is essentially a similarity matrix between each two products. How can I find the number of product id's that have less than N related products of category A with some specific similarity score?
Previously I have mainly worked with databases using a programming language's API. If I wanted to go that route, I would first query all product id's, then for each id query something like
Code:
select * from similarity inner join products on if(product_id_1='?1',product_id_2,product_id_1)=products.product_id where similarity = '?2' and category= '?3';
for an individual id which gives me a result of (product_id_1, product_id_2, similarity, product_id, category) for that specific category and similarity from which I can simply count the number of returned rows and check if the condition is true or not. However, this approach doesn't seem very efficient because I have to query for every single id separately which itself probably takes pretty long time if the dataset is not a toy example and should the query requirements change, I would need to query for each id again. For this reason, how can I query in MySQL over all id's simultaneously and have it return me the count of products which evaluate to true for the abovementioned criteria?