Digging into DBSCAN next for the statistical dashboard, and also storing snapshots for a decision tree that activates after 5-6 hours of running (2-3 days will give me more statistically relevant data to share). My main goal with this is to identify idiosyncratic behavior and outliers across 545 Binance tickers at a quick glance. DBSCAN finds groups based on density, i.e. points that are close together become a cluster and isolated points get flagged as outliers. The key difference from k-means: k-means forces every asset into a group no matter what, while DBSCAN leaves assets that don't sit in any dense region unassigned, so it separates idiosyncratic outliers better in this format. In the dashboard currently, each extended asset is described by 7 dimensions simultaneously: how extended, how long/short, velocity, rarity, volume, BTC correlation, and volatility regime. This is where I'm going to call it for now. Gathering some data and will share it in the article that I'm working on.
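A minimal sketch of the density idea described above, written in plain Python so it runs without extra packages. It is not the dashboard's actual code: the ticker names, the feature values, and the `eps` / `min_samples` settings are all made up for illustration, and the seven columns are assumed to be already scaled to comparable ranges.

```python
import math

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for outliers."""
    n = len(points)
    # Neighborhood of each point (includes the point itself).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional outlier; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels

# Hypothetical tickers, 7 made-up features each (assumed pre-scaled):
# extension, long/short, velocity, rarity, volume, BTC correlation, vol regime
features = {
    "AAAUSDT": [0.0, 0.1, 0.0, 0.1, 0.0, 0.1, 0.0],
    "BBBUSDT": [0.1, 0.0, 0.1, 0.0, 0.1, 0.0, 0.1],
    "CCCUSDT": [0.2, 0.1, 0.0, 0.1, 0.2, 0.1, 0.0],
    "DDDUSDT": [2.0, 2.1, 2.0, 1.9, 2.0, 2.1, 2.0],
    "EEEUSDT": [2.1, 2.0, 1.9, 2.0, 2.1, 2.0, 2.1],
    "FFFUSDT": [1.9, 2.0, 2.1, 2.0, 2.0, 1.9, 2.0],
    "ZZZUSDT": [4.0, -3.0, 5.0, 4.5, 6.0, -2.0, 3.0],
}

tickers = list(features)
labels = dbscan([features[t] for t in tickers], eps=0.5, min_samples=3)
outliers = [t for t, lab in zip(tickers, labels) if lab == -1]
print(dict(zip(tickers, labels)))
print("outliers:", outliers)
```

The point to notice is the `-1` label: a ticker that is not within `eps` of enough other tickers is never pushed into a cluster. On real data the result depends heavily on how the features are scaled and on the choice of `eps`, so those would need tuning.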
Stoic · Mar 23, 15:04
Trying out k-means clustering now, where the data gets split into groups by similarity. In this case it takes every extended asset and measures five parameters: how extended the asset is, how long it's been there, how fast it's moving, how rare that level is, and how much volume is behind it. Four groups emerged. Noise spike: got there fast, already moving back. Brief touch, probably not worth trading. Slow grind: been extended for multiple time cycles, low velocity. Potentially trapped positioning building. Crowded position: extreme percentile rank, moderate volume. Squeeze or liquidation risk depending on direction. Thin market: low volume relative to extension. The z-score is technically valid but needs more digging. Detailed article to follow on the entire process.
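A rough sketch of the k-means step above, again in plain Python and not the dashboard's actual code. The eight rows are invented values for the five parameters, assumed to be pre-scaled to 0..1. To keep the run deterministic, the starting centers come from a simple farthest-point rule rather than the random starts usually used with k-means.

```python
import math

def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point
    whose nearest existing center is farthest away."""
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: assign each point to its nearest center,
    move each center to the mean of its members, repeat until stable."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Made-up rows: extension, duration, velocity, rarity, volume (scaled 0..1)
rows = [
    [0.5, 0.1, 0.9, 0.5, 0.5],  # fast and brief
    [0.6, 0.1, 0.8, 0.5, 0.6],
    [0.6, 0.9, 0.1, 0.6, 0.5],  # long and slow
    [0.5, 0.8, 0.1, 0.5, 0.5],
    [0.9, 0.5, 0.4, 1.0, 0.5],  # extreme rarity, moderate volume
    [0.9, 0.5, 0.5, 0.9, 0.5],
    [0.8, 0.4, 0.4, 0.6, 0.0],  # low volume relative to extension
    [0.7, 0.5, 0.5, 0.5, 0.1],
]

kmeans_labels, kmeans_centers = kmeans(rows, k=4)
print(kmeans_labels)
```

Every row ends up with a label in `0..k-1`; there is no "unassigned" outcome. The group names (noise spike, slow grind, and so on) are not produced by the algorithm. They come from reading each center afterwards and deciding what that combination of values means.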
TLDR: in the statistics trenches