python - What are noisy samples in Scikit's DBSCAN clustering algorithm?

January 15, 2011

If I apply scikit's DBSCAN (http://scikit-learn.org/stable/modules/generated/sklearn.cluster.dbscan.html) on a similarity matrix, I get a series of labels back. Some of these labels are -1. The documentation calls them noisy samples.

What are these? Do they all belong to a single cluster, or does each belong to its own cluster since they're noisy?

Thank you

These are not exactly part of a cluster. They are simply points that do not belong to any cluster and can be "ignored" to some extent.

Remember, DBSCAN stands for "Density-Based Spatial Clustering of Applications with Noise." DBSCAN checks that a point has enough neighbors within a specified range before classifying points into clusters.

But what happens to the points that do not meet the criteria for falling into any of the main clusters? What if a point does not have enough neighbors within the specified radius to be considered part of a cluster? These are the points that are given the cluster label of -1 and are considered noise.
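To make this concrete, here is a minimal sketch using made-up coordinates and arbitrarily chosen `eps` and `min_samples` values (none of these come from the question, which used a precomputed matrix instead). It shows two isolated points that are far from each other and still both receive -1, so -1 marks "no cluster" rather than a cluster of its own. Strictly speaking, a point is labeled noise when it is neither dense enough itself nor within `eps` of a point that is.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Illustrative data (an assumption): two tight groups and two isolated points.
X = np.array([
    [1.0, 1.0], [1.1, 1.0], [1.0, 1.1],   # group A
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],   # group B
    [10.0, 0.0],                          # isolated point
    [0.0, 10.0],                          # another isolated point, far from the first
])

# eps is the neighborhood radius; min_samples counts the point itself.
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)

print(labels.tolist())  # [0, 0, 0, 1, 1, 1, -1, -1]
```

The last two points share the label -1 even though they are nowhere near each other, which is why they should not be treated as one cluster.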

So what?

Well, if you are analyzing data points and are only interested in the general clusters, you can reduce the size of the data and cut out the noise. Or, if you are using cluster analysis to classify data, in some cases it is possible to discard the noise as outliers.

In anomaly detection, points that do not fit into any category are also significant, as they can represent a problem or a rare event.
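Both uses come down to splitting the data on the -1 label. The following is a rough sketch with invented data and parameter values, and the variable names are only illustrative: the clustered points are kept for further analysis, while the noise points are set aside, either to be discarded or to be inspected as possible anomalies.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Illustrative data (an assumption): one dense group and one isolated reading.
X = np.array([
    [0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [0.2, 0.2],  # dense group
    [8.0, 8.0],                                      # isolated reading
])

labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)

noise_mask = labels == -1
X_clustered = X[~noise_mask]   # data with the noise cut out
X_noise = X[noise_mask]        # candidates for outlier / anomaly review

# Count real clusters by ignoring the noise label.
n_clusters = len(set(labels.tolist()) - {-1})

print(n_clusters, X_clustered.shape, X_noise.tolist())
```

Whether the noise is thrown away or examined more closely depends on the application; the label itself only says that the point did not fit any dense region under the chosen parameters.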
