$ man content-simhash

/content-simhash

PRICE / CALL
$0.001
USDC · base mainnet · scheme: exact
METHOD
POST
CLUSTER
wordmint
CATEGORY
uncategorized
STATUS
live
NAME
content-simhash - SimHash / 64-bit content fingerprint / near-duplicate detection / dedup hashing / locality-sensitive hash
SYNOPSIS
POST https://x402.agentutility.ai/content-simhash
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }
↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.
DESCRIPTION

Computes a 64-bit SimHash: a locality-sensitive content fingerprint for near-duplicate detection and dedup hashing. The hash is computed purely locally over token-level k-shingles (default k=3), each hashed with FNV-1a. Texts that share many shingles tend to produce SimHashes that are 'close' (small Hamming distance), so a small distance is strong evidence of near-duplicate content rather than a guarantee. Returns the fingerprint in hex and decimal forms, plus token and shingle counts. Useful for content dedup pipelines, plagiarism detection, and bot-content clustering.
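
The description above can be sketched in a few lines of Python. This is a minimal illustration, not the service's implementation: the tokenizer (lowercased `\w+` runs), the space-joined shingle encoding, the handling of texts shorter than k tokens, and the names `fnv1a_64`, `simhash64`, and `hamming` are all assumptions, so the values it produces should not be expected to match `hash_hex` from the endpoint.

```python
import re

FNV_OFFSET_64 = 0xCBF29CE484222325  # standard FNV-1a 64-bit offset basis
FNV_PRIME_64 = 0x100000001B3        # standard FNV 64-bit prime
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET_64
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & MASK_64
    return h


def simhash64(text: str, shingle_size: int = 3) -> int:
    """SimHash over distinct token k-shingles (tokenizer is an assumption)."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0
    if len(tokens) < shingle_size:
        # Assumption: a short text becomes one shingle of all its tokens.
        shingles = {" ".join(tokens)}
    else:
        shingles = {
            " ".join(tokens[i:i + shingle_size])
            for i in range(len(tokens) - shingle_size + 1)
        }
    # Each shingle votes +1 or -1 on every bit position.
    votes = [0] * 64
    for shingle in shingles:
        h = fnv1a_64(shingle.encode("utf-8"))
        for bit in range(64):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    # A bit is set in the fingerprint when its votes are net positive.
    fingerprint = 0
    for bit in range(64):
        if votes[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Because every shingle contributes one vote per bit, changing a few shingles usually flips only a few bits, which is why Hamming distance works as a similarity signal here.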

INPUT request schema
property      type    description                                         req?
text          string  Text to hash. Up to 500,000 chars.                  required
shingle_size  number  k-gram size for shingles. Range [1, 8]. Default 3.  optional
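
A client can check these limits before paying for a call. The sketch below is only an illustration: `build_request` is a hypothetical helper, and it assumes `shingle_size` must be a whole number even though the schema lists its type as number.

```python
MAX_TEXT_CHARS = 500_000            # "Up to 500,000 chars"
MIN_SHINGLE, MAX_SHINGLE = 1, 8     # "Range [1, 8]"
DEFAULT_SHINGLE = 3                 # "Default 3"


def build_request(text: str, shingle_size: int = DEFAULT_SHINGLE) -> dict:
    """Validate inputs against the documented limits and build the JSON body."""
    if not isinstance(text, str):
        raise ValueError("text is required and must be a string")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    # Assumption: k is an integer; bool is rejected because it subclasses int.
    if isinstance(shingle_size, bool) or not isinstance(shingle_size, int):
        raise ValueError("shingle_size must be an integer")
    if not MIN_SHINGLE <= shingle_size <= MAX_SHINGLE:
        raise ValueError(f"shingle_size must be in [{MIN_SHINGLE}, {MAX_SHINGLE}]")
    return {"text": text, "shingle_size": shingle_size}
```

Rejecting bad input locally avoids spending a paid call on a request the service would refuse.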
OUTPUT response shape
field          type    description
hash_hex       string  64-bit SimHash fingerprint as a 16-character lowercase hex string.
hash_int       string  Same 64-bit SimHash rendered as a decimal integer string (safe for languages without u64).
bit_count      string  Number of set bits (popcount) in the SimHash, useful as a quick sanity check.
token_count    string  Number of tokens extracted from the input text before shingling.
shingle_count  string  Number of distinct k-shingles hashed into the SimHash.
shingle_size   string  Shingle width k used (tokens per shingle), default 3.
text_chars     string  Character length of the input text that was hashed.
source         string  Echoes how the text was supplied (e.g. inline text vs fetched URL).
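
The response fields can be cross-checked and compared on the client. The following sketch assumes the response has been parsed into a dict with the string-typed fields listed above. `parse_fingerprint`, `hamming_distance`, and `is_near_duplicate` are hypothetical helper names, and the default threshold of 3 differing bits is an illustrative starting point to tune on your own data, not a value documented by the service.

```python
def parse_fingerprint(resp: dict) -> int:
    """Turn hash_hex into an int and cross-check it against hash_int and bit_count."""
    hash_hex = resp["hash_hex"]
    if len(hash_hex) != 16:
        raise ValueError("hash_hex should be 16 hex characters")
    value = int(hash_hex, 16)
    if value != int(resp["hash_int"]):
        raise ValueError("hash_hex and hash_int disagree")
    if bin(value).count("1") != int(resp["bit_count"]):
        raise ValueError("bit_count does not match the popcount of the hash")
    return value


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two 64-bit fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: int, b: int, max_distance: int = 3) -> bool:
    """Treat two fingerprints as near-duplicates when few bits differ.

    max_distance is an assumed, tunable threshold.
    """
    return hamming_distance(a, b) <= max_distance
```

Fingerprints are only comparable when they were produced with the same `shingle_size`, so it is worth storing that field next to each hash.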
EXAMPLES two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.agentutility.ai/content-simhash \
  -H 'Content-Type: application/json' \
  -d '{ "text": "The quick brown fox jumps over the lazy dog.", "shingle_size": 3 }'
The first response is 402 Payment Required, carrying the payment requirements; sign them and retry with the X-PAYMENT header.
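
The same request-then-retry flow can be sketched in Python with the standard library. Several things here are assumptions: that the 402 body is JSON, that success is signalled by status 200, and the helper names `http_send` and `call_simhash`. Signing is deliberately left to a caller-supplied `sign_payment` callback, because producing a real `transferWithAuthorization` signature needs a wallet library that this sketch does not cover.

```python
import json
import urllib.error
import urllib.request

ENDPOINT = "https://x402.agentutility.ai/content-simhash"


def http_send(payload: dict, extra_headers: dict):
    """POST the JSON payload; return (status, parsed JSON body)."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **extra_headers},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; a 402 carries the payment requirements.
        # Assumption: the error body is JSON.
        return err.code, json.loads(err.read().decode("utf-8"))


def call_simhash(text, sign_payment, shingle_size=3, send=http_send):
    """First call without payment; on 402, sign and retry with X-PAYMENT.

    sign_payment(requirements) -> str is a hypothetical callback that returns
    the X-PAYMENT header value for the given payment requirements.
    """
    payload = {"text": text, "shingle_size": shingle_size}
    status, body = send(payload, {})
    if status == 402:
        status, body = send(payload, {"X-PAYMENT": sign_payment(body)})
    if status != 200:
        raise RuntimeError(f"content-simhash call failed with status {status}")
    return body
```

Passing `send` as a parameter keeps the retry logic testable without touching the network or spending USDC.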
EXAMPLE 2 · mcp
# MCP packages on npm under
# @agentutility/mcp-*  (one per cluster)
#
# Catalog + install:
# https://mcp.agentutility.ai
#
# Or call content-simhash directly over HTTP — see above.
The MCP server handles payment automatically; your coding agent just calls the tool by name.
METADATA
tags
wordmint · content-hashing · simhash · near-duplicate-detection · dedup · fingerprinting · locality-sensitive-hash · shingling
methods
POST
cluster
wordmint
price
$0.001 USDC per call
ADJACENT other endpoints in wordmint
endpoint                  price   description
hash-string               $0.001  String hasher / multi-algorithm digest / cache-key generator / content fingerprinter / SHA-256 / SHA-1 / SHA-384 / SHA-512 / MD5.
slugify                   $0.001  URL slug generator / slugifier / canonical-identifier maker / safe-string converter / SEO slug builder / filename slug / cache-key normal…
text-normalize            $0.001  Text normalize.
token-count               $0.001  Token count / tokenizer estimate / GPT-4 token count / Claude token count / Gemini token count / context-window pre-flight.
type-inference-from-json  $0.001  Type inference from JSON / JSON to TypeScript / JSON to Zod / JSON to JSON Schema / JSON shape inferer / quicktype-style type generator.
unicode-normalize         $0.001  Unicode normalize / NFC NFD NFKC NFKD / homoglyph detection / IDN spoof / lookalike chars / invisible characters / zero-width / phishing…
cron-explain              $0.002  Cron expression explainer / cron parser / scheduling translator.
cron-parse                $0.002  Cron parser.
SEE ALSO
agentutility · wordmint · x402 · mcp · llms.txt · registry.json · bazaar.x402.org