Data Update No. 6

Data Update No. 6 presents the JDD Lab’s analysis of Reddit data to June 2024. The goal of the Justice Data and Design Lab (JDD Lab) is to use interdisciplinary teams of graduate students (currently Law and Data Science students) to collect and analyze person-centred data and user experience to improve British Columbians’ access to justice.

The JDD Lab is based at the Access to Justice Centre for Excellence (ACE) at the University of Victoria. Lab Director Kate Gower and a team of graduate students use machine learning and AI to collect and analyze new data – today’s data is from the social media site Reddit [1].

Here is what is going on:

1. A Tight Cluster Around Cars and Insurance Problems

Cluster 5 is a remarkably tight cluster in this report. The smaller the cluster, the more cohesive the topic.

Three cool things about Cluster 5:

  1. Top keywords “car”, “stratum”, “insurance”, “vehicle” and “repair”.
  2. The words car, stratum, vehicle and repair ONLY appear in Cluster 5. To put that another way, those terms do not appear in any other cluster.
  3. It has 668 Reddit posts (out of 4,331 posts collected to date).
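The "only appears in Cluster 5" check can be sketched with plain Python sets. This is a minimal illustration, not the Lab's actual pipeline: it uses only the top keywords listed in this update's chart (lowercased), not each cluster's full vocabulary, and the helper name `exclusive_terms` is hypothetical.

```python
# Top keywords per cluster as listed in this update's chart.
# Assumption: these are only the top terms, not each cluster's full vocabulary,
# so this is a partial check of exclusivity.
top_keywords = {
    "Insurance": {"car", "stratum", "insurance", "vehicle", "repair", "ticket"},
    "Housing": {"lease", "tenancy", "deposit", "rental", "unit", "roommate"},
    "Employment": {"employer", "employee", "contract", "hour", "employment", "working"},
    "Family": {"child", "parent", "dad", "mom", "mother", "father"},
}


def exclusive_terms(cluster, keywords_by_cluster):
    """Return the terms of `cluster` that appear in no other cluster (hypothetical helper)."""
    others = set().union(
        *(terms for name, terms in keywords_by_cluster.items() if name != cluster)
    )
    return keywords_by_cluster[cluster] - others


insurance_only = exclusive_terms("Insurance", top_keywords)
print(sorted(insurance_only))
```

Because only top keywords are compared here, a term can look exclusive in this sketch even if it shows up lower down in another cluster's word list. A full check would need each cluster's complete vocabulary.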

This suggests a cohesive set of Reddit posts about cars and insurance. Remember: this interactive report is clustering ALL the Reddit data the JDD Lab has collected so far.

Set the Relevance slider to “0.6” [2] for the best chance of identifying what the clusters are about (the Relevance slider is at the top right of the Report). We discussed Relevance in the LDA Report in Data Update No. 2.
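To see what the slider does, here is a small sketch of the relevance measure defined in the LDAvis paper by Sievert and Shirley: relevance = λ · log p(word | topic) + (1 − λ) · log(p(word | topic) / p(word)), where λ is the slider value. The two words and their probabilities below are made up purely for illustration; they are not taken from the JDD Lab's data.

```python
import math


def relevance(p_term_given_topic, p_term, lam=0.6):
    """LDAvis relevance: lam * log p(w|t) + (1 - lam) * log(p(w|t) / p(w))."""
    return (
        lam * math.log(p_term_given_topic)
        + (1 - lam) * math.log(p_term_given_topic / p_term)
    )


# Hypothetical probabilities, for illustration only:
# "get" is frequent in the topic but equally frequent everywhere else;
# "vehicle" is a bit less frequent in the topic but rare elsewhere.
terms = {
    "get": {"p_topic": 0.05, "p_overall": 0.05},
    "vehicle": {"p_topic": 0.03, "p_overall": 0.005},
}

for lam in (1.0, 0.6):
    ranked = sorted(
        terms,
        key=lambda w: relevance(terms[w]["p_topic"], terms[w]["p_overall"], lam),
        reverse=True,
    )
    print(f"lambda={lam}: {ranked}")
```

With the slider at 1.0, words are ranked only by how common they are in the topic, so generic words can float to the top. Moving it toward 0.6 gives more weight to words that are distinctive to the topic, which is why the topic-specific word overtakes the generic one in this toy example.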

Why this matters: Earlier Reports showed tighter clusters around Housing and Employment problems and Tenancy problems. Data Update No. 6 shows a new development: a tight cluster around cars and insurance.

What we learned: The insurance cluster appears at the top of the list because it is a tight cluster – NOT because it is the topic which most people are asking about. In the chart below, we include the count of Reddit posts in each cluster. You can see that Housing and Employment still have the highest number of posts. Insurance has fewer posts, but is more cohesive.

The LDA analysis “approves” of tight, cohesive clusters. It lists the most cohesive cluster as its “top” cluster. That might make us think that this is the topic which most people are asking for help with. However, when we add the numbers of posts, we see that Housing is still the topic most people are asking about.

Data to June 2024 – Total Reddit Posts: 4,331

Cluster 5: Insurance (668 posts)
Keywords: Car, Stratum, Insurance, Vehicle, Repair, Ticket

Cluster 4: Housing (1,336 posts)
Keywords: Lease, Tenancy, Deposit, Rental, Unit, Roommate

Cluster 3: Employment (928 posts)
Keywords: Employer, Employee, Contract, Hour, Employment, Working

Cluster 2: Family (334 posts)
Keywords: Child, Parent, Dad, Mom, mother, Mother, Father
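The difference between "most cohesive" and "most asked about" is easy to check with the counts in the chart above. The sketch below ranks the four listed clusters by number of posts and works out each one's share of the 4,331 posts collected to date. It covers only the clusters shown in the chart, so the shares do not add up to 100%.

```python
TOTAL_POSTS = 4331  # total Reddit posts collected to June 2024

# Post counts for the clusters listed in the chart above.
posts_per_cluster = {
    "Insurance": 668,
    "Housing": 1336,
    "Employment": 928,
    "Family": 334,
}

# Rank clusters by number of posts (largest first), not by cohesiveness.
ranked = sorted(posts_per_cluster.items(), key=lambda kv: kv[1], reverse=True)

# Share of all collected posts that falls in each listed cluster.
shares = {name: count / TOTAL_POSTS for name, count in posts_per_cluster.items()}

for name, count in ranked:
    print(f"{name:<11}{count:>5} posts  {shares[name]:.1%}")
```

Ranked this way, Housing comes first and Insurance third, even though the LDA report lists the Insurance cluster at the top.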

2. Taking a Closer Look at Housing Data

The team at the Alberta Law Reform Institute gave us a call after seeing a presentation on the JDD Lab’s work. They are in the early stages of a project about residential tenancy law. Their goal is to make recommendations regarding Alberta’s Residential Tenancies Act and possibly some related legislation.

We struck up a collaboration with them, and the JDD Lab was able to send them nearly 630 Reddit posts from Albertans regarding housing.

Calling all BC organizations! If you have a project that would benefit from a review of Reddit data, please contact us. We have lots of data on Housing and Employment, and we would love to build a collaboration with you!

We are grateful for the support of the Law Foundation of BC and Mitacs. We could not do this work without them.

  1. The most recent Everyday Legal Needs survey undertaken by Statistics Canada in 2021 shows that most people take action to resolve their everyday legal problems, and the top two things most people do are to ask their family and friends and to look on the internet. The JDD Lab used programs that show where people go online when they look for legal advice, and found that the top place people go is to the social media platform Reddit.
  2. For more information on why, see Data Update No. 2, or see: Carson Sievert and Kenneth E. Shirley, “LDAvis: A method for visualizing and interpreting topics” (2014) Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, Baltimore, Maryland, USA, June 27 at 67, online: The Stanford Natural Language Processing Group.