Jonathan Mallia - Saturday, July 22, 2017 9:32 AM
Hi Jonathan,
How do you set your thresholds for an automated process? With a lot of care! Have a quick read of R's documentation for DBSCAN. Any points not assigned to a cluster are labelled with a zero.
The question then, is how do you pick the hyper-parameters (the neighbourhood radius and min points in the case of DBSAN)? I would be very nervous about leaving this to a truly automated process. Like any science, you will need to collect data, run proof of concepts, evaluate these and develop your solution on historical data. It's the only way to do anything and retain confidence in your future results.
Cheers,
Nick