Ask HN: How to get aggregated user behaviour without tracking an individual user
For simpleanalytics.io I don’t want to track individual users, but I would love to give businesses insight into the combined behaviour of their visitors.
A business could ask: "How many visitors converted from DDG to sign up and what is the average duration?" To be able to calculate the conversion between landing and signup you need to know the history of events.
Let's say we have a few events including:
- Page view event with referrer DDG
- Signup event
The data could look like this:
[ ['/','mysite','ddg.com'], ['signup',30] ]
When an event happens I add it to a functional session cookie (expires after 30 min) and send the complete cookie to the API. The time of the first request is stored in another cookie, which is never sent to the API.
[ [event, your website, referrer], [event, duration since first event] ]
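A minimal sketch of how that payload could be built up, assuming the first entry carries the event, site and referrer, and every later entry carries only the event and the seconds elapsed since the first one. The function name `add_event` and its parameters are made up for illustration; in the browser this logic would live in the script that writes the cookie.

```python
def add_event(payload, event, *, site="", referrer="", elapsed_seconds=0):
    """Return a new payload with `event` appended.

    The first entry is [event, site, referrer]; later entries are
    [event, seconds since the first event]. No timestamp is stored.
    """
    if not payload:
        return [[event, site, referrer]]
    return payload + [[event, elapsed_seconds]]


# Rebuild the example from the post: a landing from DDG, then a signup 30 s later.
payload = add_event([], "/", site="mysite", referrer="ddg.com")
payload = add_event(payload, "signup", elapsed_seconds=30)
print(payload)  # [['/', 'mysite', 'ddg.com'], ['signup', 30]]
```

Because only the elapsed time is sent, the API never sees the absolute time of the first request.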
The two requests from the above example look like this:
When the first request happens it gets added to the database (see row 95):
The second request contains the information of the first request. When a request comes in with more than one array item, the API looks for the previous events in the database. Here it looks for a row where event=/, referrer=ddg.com, site=mysite.com, and the time is within the last 30 min: row 94. That row does not have to be the one this visitor actually created; any matching row will do. The table after adding the first request's row looks like this:
id | time     | event  | site       | referrer | link
94 | 20:30:20 | /      | mysite.com | ddg.com  | NULL
95 | NOW()    | /      | mysite.com | ddg.com  | NULL  <---
The connected row can be up to 30 min off, but I think that's okay.
id | time     | event  | site       | referrer | link
94 | 20:30:20 | /      | mysite.com | ddg.com  | a
95 | 20:38:28 | /      | mysite.com | ddg.com  | NULL
96 | 20:30:50 | signup | mysite.com |          | a     <---
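The linking step and the business question from the top of the post can be sketched with an in-memory SQLite database. This is only an illustration under several assumptions that the post does not fix: the table is called `events`, the oldest matching row inside the 30-minute window is the one that gets linked, the link value is a random token, the calendar date is arbitrary, and a request with no matching row is simply dropped.

```python
import sqlite3
import uuid
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)  # same lifetime as the session cookie


def fmt(t):
    return t.isoformat(timespec="seconds")


def handle_request(db, payload, now):
    """Store the newest event in `payload`, linking it to an earlier row if needed."""
    first_event, site, referrer = payload[0]
    if len(payload) == 1:
        db.execute(
            "INSERT INTO events (time, event, site, referrer) VALUES (?, ?, ?, ?)",
            (fmt(now), first_event, site, referrer),
        )
        return
    # Assumption: link to the oldest row in the window that matches the first event.
    row = db.execute(
        "SELECT id, time, link FROM events "
        "WHERE event = ? AND site = ? AND referrer = ? AND time >= ? "
        "ORDER BY time LIMIT 1",
        (first_event, site, referrer, fmt(now - WINDOW)),
    ).fetchone()
    if row is None:
        return  # assumption: nothing to link to, so the event is dropped
    row_id, row_time, link = row
    if link is None:
        link = uuid.uuid4().hex  # hypothetical link token
        db.execute("UPDATE events SET link = ? WHERE id = ?", (link, row_id))
    event, seconds = payload[-1]
    # The stored time is derived from the linked row, not from the real request time.
    event_time = datetime.fromisoformat(row_time) + timedelta(seconds=seconds)
    db.execute(
        "INSERT INTO events (time, event, site, referrer, link) VALUES (?, ?, ?, '', ?)",
        (fmt(event_time), event, site, link),
    )


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, time TEXT, event TEXT, "
    "site TEXT, referrer TEXT, link TEXT)"
)
landing = ["/", "mysite.com", "ddg.com"]
handle_request(db, [landing], datetime(2019, 1, 1, 20, 30, 20))  # someone else's landing
handle_request(db, [landing], datetime(2019, 1, 1, 20, 38, 28))  # this visitor's landing
handle_request(db, [landing, ["signup", 30]], datetime(2019, 1, 1, 20, 38, 58))

# "How many visitors converted from DDG to sign up, and what is the average duration?"
pairs = db.execute(
    "SELECT l.time, s.time FROM events s JOIN events l ON l.link = s.link "
    "WHERE s.event = 'signup' AND l.event = '/' AND l.referrer = 'ddg.com'"
).fetchall()
durations = [
    (datetime.fromisoformat(s) - datetime.fromisoformat(l)).total_seconds()
    for l, s in pairs
]
conversions = len(durations)
average = sum(durations) / conversions
print(conversions, average)  # 1 30.0
```

As in the tables above, the signup ends up attached to the older landing row and stored as 20:30:50, so the aggregate count and duration are right even though the stored chain does not belong to one real visitor.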
Do you think this is acceptable from a privacy perspective?