Cohort Retention

Cohort Analysis

Cohort retention tables are a valuable visualization that allows you to segment your data into different buckets to track their behaviour or performance over time; such as a cohort of users that sign-up in a specific month, and how active or engaged each cohort is over time.

This allows you to identify common characteristics each cohort typically experiences over a set amount of time, allowing you to anticipate how active or engaged your customers will be as they progress through each month.

The example above buckets users into cohorts based upon the month of their first purchase, and tracks how often they engaged with the service again in the following months.

  • For example, the Aug 2016 cohort had 8.92% engagement in Month 1, meaning 8.92% of the original cohort size of 4,226 users engaged the service again in Month 1, and this improved to 21.03% engagement of the original cohort in Month 4.
  • Likewise, the Sep 2016 cohort saw 12.43% engagement in Month 1 from the original cohort size of 4,573 users, which improved to an engagement level of 26.39% of the original cohort in Month 4.
  • Based off this cohort retention analysis, you may then want to ask additional questions as to whether this pattern of user engagement is seasonal, or whether this behaviour seems to hold across different cohorts!

How to Build: Cohort retention table

Cohort Analysis

1. Input Format

Your SQL query needs to return the following 4 column categories required for the cohort retention table visualization, and you can name the columns however you like. The cohort retention table in the example was generated from the following data table.

  • Cohort Name: Common values for grouping each cohort, such as ‘Aug 2016’, ‘Sep 2016’, ‘Oct 2016’ etc. within the ‘Cohort’ column.
    • The 'cohort' column in this example uses the string data type, but the Cohort Name column can be of any data type.
  • Cohort Size: The fixed population count of a given cohort as specified for the analysis. This value is held constant for each cohort, as seen in the ‘Size’ column.
    • The Cohort Size column needs to use the number data type.
  • Duration: This value indicates the time unit of analysis for the various cohorts, such as month, week, etc. Month values in the ‘duration’ column like ‘Month 0’, ‘Month 1’ and ‘Month 2’ were used in this example.
    • The 'duration' column in this example uses the string data type, but the Duration column can be of any data type.
  • Value: The value of the desired measure to be displayed for the relevant unit of time for the cohort analysis, i.e. in Month 1 for the ‘Aug 2016’ cohort, engagement was recorded at a value of 8.92.
    • The 'value' column will be converted in to percentage values based on the cohort size.
    • The 'value' column in this example uses the number data type with the 'Auto' format.
    • The Value column needs to use the number data type.

Cohort Analysis

Access the Visualization Editor at the bottom of the Report Editor screen, and select the 'Retention' table option button. If you have built your SQL query correctly to return the necessary columns, Holistics will then plot the data columns as a cohort retention table, as shown in this example.

Cohort Analysis

2. How Do I Write The SQL?

For a detailed walk-through of writing the SQL for cohort retention, refer to our blog post here.

Legacy Mode: Retention heatmap automatically calculates the percentage values, therefore you don't have to manually calculate this values via SQL (old behavior). Considering re-writing your SQL to return raw values directly if you encounter a legacy report.

Cohort Legacy Mode