Ask HN: How should we implement tracking as privacy conscious startups?

10 points by kylerpalmer 5 years ago

As a consumer, I use an adblocker because I care about my privacy. As a startup founder, I use tools like Google Analytics to drive my understanding of how to reach users and solve their problems. When I visit my own website, the irony is not lost on me that my adblocker blocks GA and other tracking tools.

I'd like to know: what are HN's opinions and/or solutions to this dichotomy?

ziddoap 5 years ago

My 2c:

1) Don't go overboard. GA is one thing. GA plus twenty other obscure analytics cookies and scripts is another. If your site looks like [1], there is 0 chance of ever being on my white-list for ads.

2) Just be honest and transparent. You don't need a big "WE USE COOKIESSSSSS" banner, but a short little message along the lines of "We use GA to help develop a better experience for you. All data is in aggregate, and not identifiable. Please consider white-listing our website if you would like to participate in making this website even better".

[1] https://imgur.com/a/7AHbQ5E

  • kylerpalmer 5 years ago

    Thanks for the tips. How do you feel about anonymized-only vs. aggregated data? For example: in-app (where user info is available to you), do you carry out any behavior tracking, A/B testing, etc.?

    • ziddoap 5 years ago

      So, again this is just for myself as a consumer, it's all about approach.

      Ideally, when I visit a site I'd like to see minimal tracking by default with an opt-in model. If a site tracks the bare minimum by default and unobtrusively asks me if I'd be okay with sharing a little bit more (or participating in certain testing, etc.) I will generally say yes. Showing me you care enough to let me opt in vastly increases my trust in you as a provider. Opt-in is best, opt-out is not ideal but acceptable if clear and easy, no option = no whitelist.

      Also, I think if you are asking as a privacy conscious startup, you should consider taking a step back and really asking yourself how much you really need to track. What is essential vs. what is nice to have? Anything that you implement that is non-essential is another step away from being privacy conscious, in my books at least.

      We're living in a time where the default seems to be to track as many data points as possible, then figure out if they are relevant/useful. This is backwards. Implement the absolute minimum and see how it goes. A month down the road you might realize you desperately need to see X metric, so you implement X tracking. However, and more likely in my opinion, you will realize that X metric adds minimal value to you, yet it tracks significantly more PII from your customers. It's all about balance.

Nextgrid 5 years ago

For analytics, try to use a self-hosted solution or even rely on your existing logs (if a feature is hitting endpoint X and you want to know how frequently it is used, you can just grep for that URL).

If you really need to use a third-party, I'd prefer one where you pay for the service (MixPanel, etc) rather than a "free" one like Google Analytics. At least the paid one has less incentive to use the data for their own purposes while the whole point of Google Analytics (and the reason for it being free) is to provide data for Google's advertising business.
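
To make the log idea concrete, here is a minimal sketch of counting hits per endpoint from access logs. It assumes the common "combined" log format; the `count_endpoint_hits` helper, the sample lines, and the `/api/export` path are made up for illustration.

```python
import re
from collections import Counter

# Matches the quoted request part of a combined-format access log line,
# e.g. "GET /api/export?format=csv HTTP/1.1". The log format is an assumption.
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def count_endpoint_hits(log_lines):
    """Count requests per path, ignoring query strings."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like requests
        path = match.group("path").split("?", 1)[0]
        hits[path] += 1
    return hits


# Hypothetical sample lines; in practice you would iterate over the log file.
sample_log = [
    '203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET /api/export?format=csv HTTP/1.1" 200 512 "-" "curl/7.68.0"',
    '203.0.113.9 - - [10/Oct/2020:13:56:01 +0000] "POST /api/export HTTP/1.1" 200 128 "-" "curl/7.68.0"',
    '203.0.113.5 - - [10/Oct/2020:13:57:12 +0000] "GET /pricing HTTP/1.1" 200 2048 "-" "curl/7.68.0"',
]

hits = count_endpoint_hits(sample_log)
print(hits["/api/export"])
```

Nothing here needs a client-side script or a cookie, so an adblocker has nothing to block.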

  • kylerpalmer 5 years ago

    Do you have an opinion about situations where 'tracking' is really what is needed? For example, keeping track of a referral from a partner, so that you can pay them for converted customers?

    • Nextgrid 5 years ago

      In this case, on the web, there is already a solution which is the Referer HTTP header (https://en.wikipedia.org/wiki/HTTP_referer) - it allows the destination web server (and no one else) to know the referring page, as long as the link doesn't go from an HTTPS page to a plain HTTP one (browsers drop the header on that kind of downgrade).

      If HTTP referrer isn't an option, a custom query parameter like ?referer=source_site is what I'd consider okay - it looks straightforward, so someone can decide to remove it manually if they have a good reason to do so. I would avoid stuff like utm_ parameters, as they just mean Google Analytics. Not only is this blocked by default by my browser & DNS server, it also means you don't care about my privacy and are happy for Google to track me (I never click on links in emails for this reason, because they all have some kind of scummy shit to track me that involves a third party).

    • ziddoap 5 years ago

      Are you paying a flat rate per conversion or a flex rate based on customer spending or some other metric?

      If it's just a flat rate per conversion, this could be accomplished with minimal (if not no) PII collected from the customer.

      • kylerpalmer 5 years ago

        Ongoing based on revenue. Most PII is in our payment processor, and never stored on our servers.

laurentl 5 years ago

You don’t have to use GA for your own needs. If you have a traditional web site, you can get a lot done with nginx logs. Things get hairier with an SPA, but it isn’t too complicated to replicate Google Analytics’ “event” features on your own.
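
As a rough sketch of what a home-grown "event" feature could look like, assuming you only care about aggregate counts: the `EventCounter` class and the event names below are hypothetical, and a real setup would persist the counts somewhere instead of keeping them in memory.

```python
from collections import Counter
from datetime import date


class EventCounter:
    """Aggregate event counts per day with no user identifier stored."""

    def __init__(self):
        self._counts = Counter()

    def record(self, category, action, day=None):
        # Only the event name and the calendar day are kept.
        day = day or date.today()
        self._counts[(day.isoformat(), category, action)] += 1

    def total(self, category, action):
        # Sum the counts for this event across all recorded days.
        return sum(
            n
            for (_, cat, act), n in self._counts.items()
            if cat == category and act == action
        )


events = EventCounter()
events.record("signup_form", "submitted", day=date(2020, 10, 10))
events.record("signup_form", "submitted", day=date(2020, 10, 11))
events.record("pricing_page", "viewed", day=date(2020, 10, 11))
print(events.total("signup_form", "submitted"))
```

Because the key is just (day, category, action), there is nothing in the store that points back to an individual visitor.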

I recall someone posting a privacy-friendly analytics tool on show HN a few months back, so you don’t have to roll your own.

The downside is that you lose GA’s insight into who your users are (gender, interests, age group...). But that is some seriously creepy stuff when you stop to think about it.

morningmoon 5 years ago

What do you need to track, and how will you take action from it? It’s valuable to figure that out first.

You may not need user tracking at all. You can track how many signups came from which domain by setting a cookie with the referrer and incrementing a count for that domain at signup. Here’s an interesting post about it https://doingdone.app/blog/building-a-startup-without-user-t...
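
A minimal sketch of that cookie-plus-counter idea, assuming a plain dict stands in for the request cookies; the `ref_domain` cookie name and both helper functions are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

# Aggregate signup counts keyed by referring domain; no per-user record.
signups_by_domain = Counter()


def referrer_cookie_value(referer_header):
    """Reduce a Referer header to its hostname, the only part kept in the cookie."""
    if not referer_header:
        return "direct"
    host = urlsplit(referer_header).hostname
    return host or "direct"


def record_signup(cookies):
    """At signup, increment the counter for the stored domain."""
    domain = cookies.get("ref_domain", "direct")
    signups_by_domain[domain] += 1


# First visit: store only the referring hostname in a cookie.
cookies = {"ref_domain": referrer_cookie_value("https://news.ycombinator.com/item?id=1")}

# Later, at signup: one visitor with the cookie, one without.
record_signup(cookies)
record_signup({})
print(signups_by_domain)
```

The cookie holds a hostname rather than an ID, so the same value is shared by everyone who arrived from that site.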

gorkemcetin 5 years ago

I also feel worried when my uBlock Origin warns me with big fonts. At least you can go with an on-premise option (e.g. Countly, Fathom, etc.). Both can be deployed on a DigitalOcean instance via Marketplace in less than 60 seconds. Then, make sure you change your privacy policy to reflect that you don't share data with 3rd parties.

kylerpalmer 5 years ago

A few topics for discussion might include:

  * Anonymized and/or aggregated data 
  * Behavior tracking via cookies (in app vs marketing?)
  * Referral/Affiliate tracking

mars 5 years ago

We use Matomo, an open source clone of GA (previously called Piwik). If you don't want a big disclaimer to be EU/GDPR compliant, you can turn off cookie dropping in the preferences. You will still be able to track sessions and conversions, but unique user counts will be off. https://matomo.org/what-is-on-premise/