
Clustering data with a constraint

Data Science Asked by Aibek on February 2, 2021

I am trying to find a way to cluster/group students by their knowledge of different subjects.

Given the following as an example:

            Subject 1   Subject 2   Subject 3
Student 1       4           5           1
Student 2       4           5           2
Student 3       5           2           1
Student 4       2           5           5
Student 5       5           5           2
Student 6       4           5           1
Student 7       2           2           5
Student 8       4           4           2
Student 9       1           2           1
Student 10      1           1           3
Student 11      1           2           1
Student 12      3           1           4

Also, the number of students per group should be between 3 and 4 (the bounds could be anything; I just felt that these boundaries make sense for this example).

I would like to group the students so that, within each group, the average score for each subject is as close as possible to the overall average for that subject across all students.

One way to define this might be to minimize, for each subject, the total variance between the group averages and the overall average.
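To make the objective concrete, here is a minimal sketch of one possible way to score a candidate grouping: the sum, over groups and subjects, of the squared gap between the group average and the overall average. The names `SCORES`, `subject_means` and `imbalance` are my own illustrative choices, and the squared-gap form is only one reading of "variance between group and total", so treat it as an assumption rather than a definition.

```python
from statistics import mean

# Scores from the example table: student -> (Subject 1, Subject 2, Subject 3)
SCORES = {
    1: (4, 5, 1), 2: (4, 5, 2), 3: (5, 2, 1), 4: (2, 5, 5),
    5: (5, 5, 2), 6: (4, 5, 1), 7: (2, 2, 5), 8: (4, 4, 2),
    9: (1, 2, 1), 10: (1, 1, 3), 11: (1, 2, 1), 12: (3, 1, 4),
}

def subject_means(students):
    """Per-subject mean score over the given student ids."""
    columns = zip(*(SCORES[s] for s in students))
    return [mean(col) for col in columns]

def imbalance(groups, min_size=3, max_size=4):
    """Sum of squared gaps between each group's subject means and the overall means.

    Assumed objective: lower is better, 0 means every group matches
    the overall average in every subject.
    """
    if any(not (min_size <= len(g) <= max_size) for g in groups):
        raise ValueError("group size outside the allowed range")
    overall = subject_means(SCORES)
    return sum(
        (group_mean - overall_mean) ** 2
        for g in groups
        for group_mean, overall_mean in zip(subject_means(g), overall)
    )

# Example: score one arbitrary grouping into four groups of three.
candidate = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
score = imbalance(candidate)
```

Any search over groupings (swapping students between groups, for instance) could then compare candidates by this score, with the size check enforcing the 3 to 4 constraint.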

How would you suggest approaching this problem? What algorithm would you suggest to use?
