I am trying to pull some analytics from my Django model table. So far I can count the total values of a field and the distinct values of a field. I also know how to create lists showing total values of fields within a distinct field. Now I'd like to count how many distinct values of one field occur within each distinct value of a different field. Here's the table I am working with:
| uid | cid |
|-------|--------|
| a | apple |
| a | apple |
| a | grape |
| b | apple |
| b | grape |
| c | apple |
| c | pear |
| c | pear |
| c | pear |
So the result I am trying to provide is:
cid: apple (distinct uid count: 3),
cid: grape (distinct uid count: 2),
cid: pear (distinct uid count: 1)
and also:
cid apple's distinct uids: a, b, c
cid grape's distinct uids: a, b
cid pear's distinct uids: c
So far I have been able to get distinct counts and lists like this:
dist_uid_list = Fruit.objects.filter(client=user).values('uid').distinct()
output >>> <QuerySet [{'uid': 'a'}, {'uid': 'b'}, {'uid': 'c'}]>
and this:
dist_uid_count = Fruit.objects.filter(client=user).values('uid').distinct().count()
output >>> 3
and more complex:
total_actions_per_cid = Fruit.objects\
    .filter(client=user)\
    .values('cid').distinct()\
    .annotate(num_actions=Count('action_name'))\
    .order_by('cid')
output >>> <QuerySet [{'cid': 'apple', 'num_actions': 4}, {'cid': 'grape', 'num_actions': 2}, {'cid': 'pear', 'num_actions': 3}]>
So here is the question: how could I go in and take each distinct 'cid' and find a count of how many distinct 'uid's exist within each?
Here are all the django files that might be helpful to see:
models.py
from django.db import models

class Fruit(models.Model):
    uid = models.CharField(max_length=50, blank=True)
    cid = models.CharField(max_length=50, blank=True)
    record_date = models.DateTimeField(auto_now_add=True)
    client = models.CharField(max_length=50, blank=True)
    action_name = models.CharField(max_length=50, blank=True)
views.py
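from django.contrib.auth.mixins import LoginRequiredMixin
from django.contrib.auth.models import User  # assumes the default auth User model
from django.db.models import Count
from django.shortcuts import get_object_or_404
from django.views.generic import ListView
from .models import Fruit  # assumes models.py lives in the same app

class DashboardListView(LoginRequiredMixin, ListView):
    model = Fruit
    template_name = 'blog/dashboard.html'
    context_object_name = 'fruit'
    ordering = ['-record_date']

    def get_context_data(self, **kwargs):
        user = get_object_or_404(User, username=self.kwargs.get('username'))
        context = super().get_context_data(**kwargs)
        dist_uid_list = Fruit.objects.filter(client=user).values('uid').distinct()
        dist_uid_count = Fruit.objects.filter(client=user).values('uid').distinct().count()
        total_actions_per_cid = Fruit.objects\
            .filter(client=user)\
            .values('cid').distinct()\
            .annotate(num_actions=Count('action_name'))\
            .order_by('cid')
        context['dist_uid_list'] = dist_uid_list
        context['dist_uid_count'] = dist_uid_count
        context['total_actions_per_cid'] = total_actions_per_cid
        # get_context_data must return the context, or the template gets nothing
        return context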
html outputs
{% for user in dist_uid_list %}
{{ user.uid }}
{% endfor %}
{{ dist_uid_count }}
{% for action in total_actions_per_cid %}
{{ action.cid }}: {{ action.num_actions }}
{% endfor %}
I assume there needs to be some sort of for loop and multiple defs involved in the view to make this work. I just can't quite figure out how I should go about doing that.
The Count aggregate has a distinct parameter that may help:
>>> q = Book.objects.annotate(Count('authors', distinct=True), Count('store', distinct=True))
https://docs.djangoproject.com/en/3.1/topics/db/aggregation/#combining-multiple-aggregations
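To make the effect of distinct=True concrete, here is a minimal sketch that runs the question's table through Python's built-in sqlite3 module. The SQL is my own approximation of a grouped COUNT(DISTINCT ...) query, not output captured from Django, and the table name fruit and the helper count_uids_per_cid are made up for illustration.

```python
import sqlite3

# Rows copied from the table in the question: (uid, cid).
ROWS = [
    ("a", "apple"), ("a", "apple"), ("a", "grape"),
    ("b", "apple"), ("b", "grape"),
    ("c", "apple"), ("c", "pear"), ("c", "pear"), ("c", "pear"),
]


def count_uids_per_cid(rows):
    """Return {cid: (total_rows, distinct_uids)} using plain SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table name; only the two relevant columns are modelled.
        conn.execute("CREATE TABLE fruit (uid TEXT, cid TEXT)")
        conn.executemany("INSERT INTO fruit (uid, cid) VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT cid, COUNT(uid), COUNT(DISTINCT uid) "
            "FROM fruit GROUP BY cid ORDER BY cid"
        )
        return {cid: (total, distinct) for cid, total, distinct in cursor}
    finally:
        conn.close()


counts = count_uids_per_cid(ROWS)
for cid, (total, distinct) in counts.items():
    print(f"cid: {cid} (rows: {total}, distinct uid count: {distinct})")
```

The second number in each pair is the plain count, which includes repeated uids; the third is the distinct count the question asks for.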
Thus your query would look like:
# I removed the .distinct() after .values(): when followed by .annotate(),
# .values() acts like a GROUP BY, so the 'cid's are already unique.
total_actions_per_cid = Fruit.objects\
    .filter(client=user)\
    .values('cid')\
    .annotate(num_uids=Count('uid', distinct=True))
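The question also asks for the list of distinct uids under each cid, which the count alone does not give. One possible approach, sketched here under assumptions, is to fetch the distinct (cid, uid) pairs and group them in Python. The helper name uids_by_cid is hypothetical, and the ORM call appears only in a comment so the snippet runs without a Django project; the stand-in data is copied from the question's table.

```python
from collections import defaultdict


def uids_by_cid(pairs):
    """Group (cid, uid) pairs into {cid: sorted list of distinct uids}."""
    grouped = defaultdict(set)
    for cid, uid in pairs:
        grouped[cid].add(uid)  # a set drops repeated uids automatically
    return {cid: sorted(uids) for cid, uids in sorted(grouped.items())}


# In the view, the pairs could come from the ORM (not executed here):
#   pairs = Fruit.objects.filter(client=user).values_list('cid', 'uid').distinct()
# Stand-in data copied from the table in the question:
pairs = [
    ("apple", "a"), ("apple", "a"), ("grape", "a"),
    ("apple", "b"), ("grape", "b"),
    ("apple", "c"), ("pear", "c"), ("pear", "c"), ("pear", "c"),
]

result = uids_by_cid(pairs)
for cid, uids in result.items():
    print(f"cid {cid}'s distinct uids: {', '.join(uids)} (count: {len(uids)})")
```

Taking len() of each list gives the same number as num_uids in the query above, so this dictionary could be put into the context and used for both outputs if the data set is small enough to group in memory.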