Skip to main content

Query Fingerprinting

Detects semantically equivalent SQL queries by removing literals and hashing a canonical form.


Overview

Query fingerprinting converts SQL into a stable, normalized representation so that logically identical queries map to the same fingerprint.


Why Fingerprinting Matters

Fingerprinting enables:

  • Query deduplication
  • Cache key generation
  • Hot query detection
  • Retry pattern analysis
  • Analytics aggregation

Example

Input:

SELECT * FROM users WHERE id = 10

Normalized:

SELECT * FROM users WHERE id = ?

A hash is then generated from the normalized query.


Tool Usage

./mvnw exec:java -Dexec.mainClass="io.github.tanejagagan.sql.commons.Fingerprint"

Implementation Notes

  • Uses AST-based transformation
  • Ignores literal values
  • Preserves query structure

Production Use

Fingerprinting is used internally for:

  • Metrics aggregation
  • Planning heuristics
  • Observability

Read More:

Deep dive article → https://medium.com/@tanejagagan/detecting-similar-sql-queries-using-duckdb