Analytics
Table-level analysis, a query editor with dashboards, correlation tools, and experimental 3D stress surfaces.
Tables Efficiency
The default tab shows how effectively each table's ordering key serves the actual query workload. For every table that appears in SELECT queries recorded in system.query_log, the view computes:
- Pruning % - how much data the primary key skips before reading. Calculated as 1 - (SelectedParts/TotalParts × SelectedMarks/TotalMarks). A score of 95% means the engine only reads 5% of the data.
- Poor pruning queries - count of queries where pruning was below 50%
- Parts scanned % - average fraction of parts touched per query
- Avg duration and CPU time - resource cost of queries against this table
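The pruning formula above can be sketched as a small function. The counter names follow ClickHouse's SelectedParts/SelectedMarks ProfileEvents; the exact aggregation TraceHouse performs over query_log is not shown here.

```python
def pruning_score(selected_parts, total_parts, selected_marks, total_marks):
    """Fraction of data the primary key skipped before reading.

    1.0 means a perfectly pruned query; 0.0 means a full scan.
    """
    if total_parts == 0 or total_marks == 0:
        return 0.0  # nothing to read: no pruning signal
    read_fraction = (selected_parts / total_parts) * (selected_marks / total_marks)
    return 1.0 - read_fraction

# A query touching 2 of 10 parts and 50 of 1000 marks reads ~1% of
# the data, so its pruning score is ~0.99.
```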
Controls:
- Lookback selector (1, 7, or 30 days)
- Database filter with a toggle to include/exclude system databases
- Sort by any column to find the worst offenders
Click a table row to see per-query-pattern breakdowns: which normalized query hashes hit this table, how often, and how efficiently each one prunes. This helps identify specific query shapes that bypass the ordering key.
Query Editor
Write and execute ClickHouse SQL queries with:
- Syntax highlighting and auto-completion
- Time range picker with preset intervals and custom ranges
- Result views - switch between table view and chart visualizations (bar, line, pie, area)
- 3D charts - optional Three.js rendering for bar and pie charts
Presets
Presets are pre-built queries bundled with TraceHouse that cover common monitoring scenarios. They are organized into groups:
- Overview - Database sizes, table sizes, part counts
- Selects - Query throughput, duration, and resource usage
- Inserts - Insert rates, row counts, and latency
- Merges - Merge throughput and backlog
- Parts - Part sizes, levels, and partition distribution
- Resources - CPU, memory, disk I/O trends
Drill-Down Navigation
Charts support click-to-drill: clicking a chart segment navigates to a related query with the clicked value passed as a filter. This enables interactive exploration from high-level overviews down to specific tables, queries, or time ranges.
Correlation
When multiple metrics are overlaid on the same chart, the correlation panel compares their time series using three algorithms:
- Pearson - linear (proportional) correlation. Sensitive to outliers.
- Spearman - rank correlation. Detects monotonic relationships and is robust to outliers. High Spearman + low Pearson suggests a non-linear relationship.
- Cross-correlation - time-shifted correlation. Finds lagged relationships where one metric leads or follows another. Reports the best lag offset.
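The three measures can be illustrated in pure Python. This is a simplified sketch: Spearman here ignores rank ties, and the panel's actual implementation may differ.

```python
import math

def pearson(x, y):
    """Linear correlation; sensitive to outliers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def spearman(x, y):
    """Rank correlation; robust to outliers. (Rank ties not handled.)"""
    def ranks(v):
        order = sorted(range(len(v)), key=v.__getitem__)
        r = [0.0] * len(v)
        for rank, idx in enumerate(order):
            r[idx] = float(rank)
        return r
    return pearson(ranks(x), ranks(y))

def best_lag(x, y, max_lag=10):
    """Cross-correlation: Pearson at each shift; returns (lag, r) with max |r|.

    A negative lag means y trails x by |lag| samples.
    """
    best = (0, pearson(x, y))
    for lag in range(1, max_lag + 1):
        for l, xs, ys in ((lag, x[lag:], y[:-lag]), (-lag, x[:-lag], y[lag:])):
            if len(xs) > 2:
                r = pearson(xs, ys)
                if abs(r) > abs(best[1]):
                    best = (l, r)
    return best
```

For example, a quadratic series against a linear one gives Spearman = 1.0 (perfectly monotonic) while Pearson comes out lower, matching the "non-linear monotonic" pattern in the table below.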
The system runs all three and compares them to produce an interpretation:
| Pattern | Meaning |
|---|---|
| All three strong | Straightforward linear relationship |
| Spearman strong, Pearson weak | Non-linear monotonic (e.g. exponential growth) |
| Cross-corr strong, Pearson weak | Correlated with a time delay |
| Pearson strong, Spearman weak | Outlier-driven (a few extreme points inflate the score) |
Series with weak overall correlation are checked with a rolling window to find localized bursts where the metrics temporarily move together.
Overlay lines are dimmed by correlation strength: strongly correlated series stay bright, weakly correlated ones fade out, so patterns are visible at a glance.
Surfaces
Requires the experimental features toggle in Settings. We are evaluating which combination of resource metrics, rendered together on a surface, best exposes critically stressed resources at both the system and table level. The surfaces shown here are still being refined.
The Surfaces tab renders 3D visualizations of table and system activity over time, built from per-minute aggregations of system.query_log and system.part_log. All data is smoothed with a Gaussian kernel to reduce noise while preserving trends.
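Gaussian smoothing of a per-minute series looks roughly like this; the kernel width (sigma) and truncation radius are assumptions, not the values TraceHouse uses.

```python
import math

def gaussian_smooth(series, sigma=2.0):
    """Convolve with a truncated Gaussian kernel (±3σ), renormalizing at the edges."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    out = []
    for i in range(len(series)):
        num = den = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(series):
                num += w * series[j]
                den += w
        out.append(num / den)  # divide by the in-bounds weight sum
    return out
```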
Stress Surface
Shows multi-dimensional resource intensity for a selected table as a 3D surface. Time runs along the X-axis (per minute), nine resource channels are arrayed along the Z-axis, and the Y-axis (height) shows normalized stress intensity on a scale from Idle to Peak.
The nine resource channels are:
| Channel | Source metric |
|---|---|
| Concurrent Load | query count |
| Avg Latency | mean duration |
| P95 Latency | 95th percentile duration |
| Rows Scanned | total read rows |
| Read I/O | total read bytes |
| Memory | total memory usage |
| CPU Time | total CPU microseconds |
| I/O Wait | total I/O wait microseconds |
| Marks Touched | total selected marks |
Each channel is independently normalized to [0, 1], so the surface reveals the shape of stress across dimensions rather than absolute values. The color scale runs from deep blue (idle) through green and yellow to red and magenta (peak stress).
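Min-max scaling is one plausible reading of "independently normalized"; the actual scaling TraceHouse applies may differ. A sketch:

```python
def normalize_channels(channels):
    """Min-max scale each channel to [0, 1] independently; flat channels map to 0."""
    out = {}
    for name, values in channels.items():
        lo, hi = min(values), max(values)
        span = hi - lo
        out[name] = [(v - lo) / span if span else 0.0 for v in values]
    return out
```

Because each channel is scaled on its own, a lane that peaks at 10 ms and one that peaks at 10 s both reach the same height, which is what lets the surface show the shape of stress rather than absolute magnitudes.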
Insert activity appears as blue markers below the surface, sized by inserted rows. Merge activity appears as orange markers at the far edge, sized by merge count. This makes it easy to spot recurring stress patterns (e.g. a batch job that spikes CPU every hour) or correlate insert bursts with merge storms.
Controls: time range picker (15 minutes to 30 days), database/table selector.
Pattern Surface
Shows the top 12 most time-consuming query patterns for a table. Each pattern (by normalized_query_hash) gets its own lane on the Z-axis, ranked by total duration. Time runs along the X-axis, and the Y-axis (height) shows average duration in milliseconds — this is the actual latency value, not a normalized score.
The color scale uses a magma palette (dark purple to bright yellow) to indicate intensity. A collapsible side panel lists each pattern with its query count, average duration, average memory, and a sample query. Hovering over the surface highlights the corresponding pattern.
This reveals which query shapes dominate a table's workload and how their performance varies over time.
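The lane selection reduces to a group-and-rank over query_log rows. Field names here are illustrative of the shape of the data, not an exact schema.

```python
from collections import defaultdict

def top_patterns(query_rows, limit=12):
    """Group rows by normalized_query_hash; rank lanes by total duration."""
    totals = defaultdict(float)
    for row in query_rows:
        totals[row["normalized_query_hash"]] += row["duration_ms"]
    # Highest total duration first; one Z-axis lane per surviving hash.
    return sorted(totals, key=totals.get, reverse=True)[:limit]
```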
Resource Surface
A system-wide or per-table view of where resources are being consumed. It operates at two levels:
- System level — Time × Tables. Each lane represents a table, ranked by resource consumption. Shows which tables are stressing the system most at any point in time.
- Table level — Time × Query patterns. Drill into a table to see which query patterns drive its resource usage.
By default the height represents a composite stress metric: a weighted combination of CPU, memory, I/O, and marks, each normalized as a percentage of system-wide usage. You can switch to a single metric instead:
- CPU time, Memory, Read I/O, I/O wait, Marks touched, Duration, Query count, Rows read
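The composite metric can be sketched as follows. The weights are hypothetical (the docs do not specify the actual composition); what matters is that each resource is expressed as a share of system-wide usage before weighting.

```python
# Hypothetical weights: the actual composition is not specified here.
WEIGHTS = {"cpu": 0.35, "memory": 0.25, "io": 0.25, "marks": 0.15}

def composite_stress(lane, system_totals):
    """Weighted sum of a lane's share of system-wide usage for each resource.

    Both arguments map resource name -> raw consumption; each share is a
    fraction of the system total, so the score stays in [0, 1].
    """
    score = 0.0
    for resource, weight in WEIGHTS.items():
        total = system_totals.get(resource, 0.0)
        share = lane.get(resource, 0.0) / total if total else 0.0
        score += weight * share
    return score
```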
A side panel lists each lane with a mini stacked bar showing the resource breakdown. Hovering a lane in the panel highlights it on the surface, and vice versa. Click a table lane at the system level to drill into its per-pattern breakdown.
Query Language
Queries use a lightweight meta-comment language (@meta, @chart, @drill, @rag) to configure visualization and navigation. See the Analytics Query Language reference for full details.