techniques for larger datasets
- rectangular binning (plus scaling counts); see the options in Vega: https://vega.github.io/vega/examples/density-heatmaps/. I think this could be used to replicate the results in Cook and Miller, 2006.
- continuous hexagonal binning (plus scaling counts). We can think of binning counts as a streaming algorithm (e.g. use something like a count-min sketch data structure)
- random sampling (or stratified random sampling if interest is in clustering)
- transparency
- displaying largish categorical data (i.e. cluster labels) via hextris (more for static displays or linking)
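
To make the binning idea concrete, here is a minimal sketch of rectangular binning done as a streaming update, with counts scaled on a log scale afterwards. It keeps exact counts in a `Counter` rather than a count-min sketch, and the function names, the `bin_width` parameter and the log scaling are all assumptions for illustration, not something this project already implements.

```python
import math
from collections import Counter

def update_bin_counts(counts, points, x_min, y_min, bin_width):
    """Add one chunk of (x, y) points to a running Counter of bin counts."""
    for x, y in points:
        # Integer index of the square bin that contains the point.
        ix = math.floor((x - x_min) / bin_width)
        iy = math.floor((y - y_min) / bin_width)
        counts[(ix, iy)] += 1
    return counts

def scale_counts(counts):
    """Map raw counts to (0, 1] on a log scale so dense bins do not dominate."""
    if not counts:
        return {}
    top = math.log1p(max(counts.values()))
    return {cell: math.log1p(n) / top for cell, n in counts.items()}

# The data arrives in chunks; only the bin counts are kept in memory.
counts = Counter()
chunks = [
    [(0.1, 0.1), (0.2, 0.3), (1.5, 0.2)],
    [(0.4, 0.4), (1.7, 0.9), (2.5, 2.5)],
]
for chunk in chunks:
    update_bin_counts(counts, chunk, x_min=0.0, y_min=0.0, bin_width=1.0)

shade = scale_counts(counts)  # one value per occupied bin, usable as fill opacity
```

Memory grows with the number of occupied bins rather than the number of points, which is what makes this usable on larger datasets. A count-min sketch would swap the `Counter` for a fixed-size table at the cost of approximate counts.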
preprocessing transformations
- add kernel PCA decomposition as a preprocessing step
- scaled distances based on kNN
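
One possible reading of "scaled distances based on kNN" is local scaling: divide each pairwise distance by the geometric mean of the two points' k-th nearest neighbour distances, so that clusters of different density become comparable. The sketch below assumes that reading; the function names are hypothetical and the brute-force search is only suitable for small examples.

```python
import math

def kth_neighbour_distance(points, k):
    """Distance from each point to its k-th nearest neighbour (itself excluded)."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])  # requires 1 <= k <= len(points) - 1
    return radii

def scaled_distance_matrix(points, k):
    """Pairwise distances divided by sqrt(r_i * r_j), r = k-th neighbour distance.

    Assumes no point has k exact duplicates; otherwise a radius would be zero.
    """
    radii = kth_neighbour_distance(points, k)
    n = len(points)
    return [
        [math.dist(points[i], points[j]) / math.sqrt(radii[i] * radii[j])
         for j in range(n)]
        for i in range(n)
    ]

# Two clusters with different spread: a tight one and one twice as wide.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (12, 10), (10, 12)]
scaled = scaled_distance_matrix(points, k=1)
```

After scaling, nearest-neighbour distances within the tight cluster and within the wide cluster are both close to 1, so a downstream embedding no longer sees the wide cluster as more spread out.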
spatial linking
or manipulating subgraphs of kNN (on data and embedding space) via brushing operations
- LC meta criterion, mean relative rank error, neighborhood loss
- use the k-NN indexes to form a tangent map on the data space / embedding space. Rescale the data by average neighbourhood mean and then compute the SVD again (the left singular vectors will be an estimate of the tangent)
- can use the aforementioned singular vectors to produce new display axes
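
The neighbourhood-based criteria listed above all compare k-NN sets in the data space with k-NN sets in the embedding space. As a rough sketch, the code below computes the average k-NN overlap and an adjusted version that subtracts `k / (n - 1)`, the overlap expected by chance. I believe this matches the usual form of the LC meta criterion, but treat the exact normalisation, the function names and the brute-force neighbour search as assumptions.

```python
import math

def knn_sets(points, k):
    """Index set of the k nearest neighbours of each point (ties broken by index)."""
    neighbours = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        order = sorted(others, key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(order[:k]))
    return neighbours

def neighbourhood_overlap(data, embedding, k):
    """Average fraction of each point's k-NN in the data kept in the embedding."""
    n = len(data)
    pairs = zip(knn_sets(data, k), knn_sets(embedding, k))
    return sum(len(a & b) for a, b in pairs) / (n * k)

def lc_meta_criterion(data, embedding, k):
    """Overlap minus k / (n - 1), the value expected for a random embedding."""
    return neighbourhood_overlap(data, embedding, k) - k / (len(data) - 1)

# The data lies in the plane z = 0, so dropping z preserves every neighbourhood.
data = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 0), (6, 5, 0)]
embedding = [(x, y) for x, y, _ in data]
overlap = neighbourhood_overlap(data, embedding, k=2)
adjusted = lc_meta_criterion(data, embedding, k=2)
```

The per-point overlaps (before averaging) are what a brush would need: selecting a point could highlight which of its data-space neighbours were lost in the embedding.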
manual clustering
via persistent brushes, stability of k-means etc.