Self-hosted Nostr indexer for NIP-35 torrent events with federated curation
Guide to operating Lighthouse as a content curator.
A curator is a trusted entity that:
Curators are the “human filters” in the Web of Trust model.
trust:
  depth: 0 # Whitelist only
indexer:
  tag_filter_enabled: true
  tag_filter:
    - movies
    - tv
Import or Create Rulesets
Rulesets define moderation policies. There are two types:
Censoring rulesets contain deterministic blocking rules. Their rejections always take priority.
| Code | Description |
|---|---|
| `LEGAL_DMCA` | Documented DMCA takedown |
| `LEGAL_ILLEGAL` | Manifestly illegal content |
| `ABUSE_SPAM` | Spam or flooding |
| `ABUSE_MALWARE` | Malicious content |
Semantic rulesets contain quality and classification rules. Aggregation policy applies.
| Code | Description |
|---|---|
| `SEM_DUPLICATE_EXACT` | Exact duplicate (same infohash) |
| `SEM_DUPLICATE_PROBABLE` | Same files, different event |
| `SEM_BAD_META` | Incomplete metadata |
| `SEM_LOW_QUALITY` | Below quality threshold |
| `SEM_CATEGORY_MISMATCH` | Wrong category |
{
  "name": "Movie Curation Ruleset",
  "type": "semantic",
  "version": "1.0.0",
  "description": "Quality rules for movie torrents",
  "author": "npub1...",
  "rules": [
    {
      "id": "min-size",
      "name": "Minimum Size",
      "type": "threshold",
      "field": "size",
      "operator": "gte",
      "value": 100000000,
      "reason_code": "SEM_LOW_QUALITY",
      "priority": 10
    }
  ]
}
Pattern rules match against text fields.
{
  "type": "pattern",
  "field": "name",
  "operator": "contains",
  "value": "CAM",
  "reason_code": "SEM_LOW_QUALITY"
}
Operators: equals, contains, starts_with, ends_with, regex
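The five pattern operators can be sketched in Python as follows. This is an illustration only: the function name `pattern_matches` is hypothetical, and it assumes case-sensitive matching and that `regex` means "matches anywhere in the text", neither of which is stated above.

```python
import re

def pattern_matches(operator: str, text: str, value: str) -> bool:
    """Return True when `text` satisfies the pattern operator against `value`."""
    if operator == "equals":
        return text == value
    if operator == "contains":
        return value in text
    if operator == "starts_with":
        return text.startswith(value)
    if operator == "ends_with":
        return text.endswith(value)
    if operator == "regex":
        # Assumption: unanchored search, not a full-string match.
        return re.search(value, text) is not None
    raise ValueError(f"unknown pattern operator: {operator}")

print(pattern_matches("contains", "Some.Movie.2024.CAM.x264", "CAM"))    # True
print(pattern_matches("contains", "Some.Movie.2024.CAMERA.Cut", "CAM"))  # True
```

The second call shows why a plain `contains` on `CAM` can over-match: any name containing those three letters satisfies it.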
Threshold rules compare numeric values.
{
  "type": "threshold",
  "field": "size",
  "operator": "gte",
  "value": 100000000,
  "reason_code": "SEM_LOW_QUALITY"
}
Operators: eq, ne, gt, gte, lt, lte
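A minimal Python sketch of the six comparison operators, using the standard `operator` module. The names `THRESHOLD_OPS` and `threshold_holds` are hypothetical. The sketch only evaluates the comparison; whether a rule attaches its `reason_code` when the comparison holds or when it fails is not specified here, so that step is deliberately left out.

```python
import operator

# Map each rule operator name to the matching Python comparison.
THRESHOLD_OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}

def threshold_holds(op: str, actual: int, value: int) -> bool:
    """Return True when `actual <op> value` holds."""
    if op not in THRESHOLD_OPS:
        raise ValueError(f"unknown threshold operator: {op}")
    return THRESHOLD_OPS[op](actual, value)

print(threshold_holds("gte", 250_000_000, 100_000_000))  # True
print(threshold_holds("gte", 50_000_000, 100_000_000))   # False
```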
List rules match against lists.
{
  "type": "list",
  "field": "tags",
  "operator": "contains_any",
  "value": ["spam", "fake"],
  "reason_code": "ABUSE_SPAM"
}
Operators: contains_any, contains_all, contains_none
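The three list operators map naturally onto set operations. The sketch below is one possible reading, with the hypothetical name `list_matches`, and it assumes exact, case-sensitive comparison of list items.

```python
from typing import Iterable

def list_matches(operator: str, actual: Iterable[str], value: Iterable[str]) -> bool:
    """Compare the field's items (`actual`) with the rule's items (`value`)."""
    have, wanted = set(actual), set(value)
    if operator == "contains_any":
        return bool(have & wanted)        # at least one item in common
    if operator == "contains_all":
        return wanted <= have             # every rule item is present
    if operator == "contains_none":
        return have.isdisjoint(wanted)    # no item in common
    raise ValueError(f"unknown list operator: {operator}")

tags = ["movies", "1080p", "fake"]
print(list_matches("contains_any", tags, ["spam", "fake"]))   # True
print(list_matches("contains_all", tags, ["spam", "fake"]))   # False
print(list_matches("contains_none", tags, ["spam", "fake"]))  # False
```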
| Field | Type | Description |
|---|---|---|
| `name` | string | Torrent name |
| `size` | integer | Size in bytes |
| `category` | integer | Torznab category |
| `tags` | array | Nostr tags |
| `pubkey` | string | Publisher’s npub |
| `created_at` | timestamp | Publication time |
curl -X POST http://localhost:9999/api/rulesets \
  -H "X-API-Key: your-key" \
  -H "Content-Type: application/json" \
  -d @ruleset.json

curl -X POST http://localhost:9999/api/rulesets/import \
  -H "X-API-Key: your-key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/ruleset.json"}'
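For scripted setups, the import call can also be built with Python's standard library. This sketch mirrors the second curl command above (same endpoint, headers, and body); the helper name `build_import_request` is hypothetical, and the response format is not documented here, so the sketch stops at constructing the request.

```python
import json
import urllib.request

def build_import_request(base_url: str, api_key: str, ruleset_url: str) -> urllib.request.Request:
    """Build a POST request equivalent to the curl import example."""
    body = json.dumps({"url": ruleset_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/rulesets/import",
        data=body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )

req = build_import_request(
    "http://localhost:9999", "your-key", "https://example.com/ruleset.json"
)
print(req.get_method(), req.full_url)
# To actually send it against a running instance:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status)
```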
When content is processed, a decision is created:
{
  "decision_id": "abc123",
  "decision": "accept",
  "reason_codes": [],
  "ruleset_type": "semantic",
  "ruleset_version": "1.0.0",
  "ruleset_hash": "sha256...",
  "target_event_id": "nostr_event_id",
  "target_infohash": "aabbccdd...",
  "curator_pubkey": "npub1...",
  "signature": "ed25519_sig...",
  "created_at": "2024-01-01T00:00:00Z"
}
Content → Censoring Rules → Semantic Rules → Decision → Sign → Publish
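The ordering in this flow, with censoring rules evaluated before semantic rules and censoring rejections taking priority, can be sketched as below. All names are hypothetical. Two assumptions are baked in and may not match Lighthouse: the rejecting decision value is spelled `"reject"`, and the semantic aggregation policy is simplified to "any reason code rejects".

```python
from typing import Callable, List, Optional

# A check inspects one content record and returns a reason code, or None.
Check = Callable[[dict], Optional[str]]

def run_checks(content: dict, checks: List[Check]) -> List[str]:
    """Collect the reason codes raised by a list of checks."""
    codes = []
    for check in checks:
        code = check(content)
        if code is not None:
            codes.append(code)
    return codes

def decide(content: dict, censoring: List[Check], semantic: List[Check]) -> dict:
    """Run censoring checks first; only clean content reaches the semantic checks."""
    codes = run_checks(content, censoring)
    if codes:
        return {"decision": "reject", "ruleset_type": "censoring", "reason_codes": codes}
    codes = run_checks(content, semantic)
    # Assumption: any semantic reason code rejects; a real aggregation policy may differ.
    decision = "reject" if codes else "accept"
    return {"decision": decision, "ruleset_type": "semantic", "reason_codes": codes}

# Hypothetical checks for illustration.
def spam_tag(content: dict) -> Optional[str]:
    return "ABUSE_SPAM" if "spam" in content.get("tags", []) else None

def too_small(content: dict) -> Optional[str]:
    return "SEM_LOW_QUALITY" if content.get("size", 0) < 100_000_000 else None

result = decide({"name": "x", "size": 50_000_000, "tags": ["spam"]}, [spam_tag], [too_small])
print(result["decision"], result["reason_codes"])  # reject ['ABUSE_SPAM']
```

Although the sample record would also fail the size check, only the censoring reason code is reported, because the semantic stage is never reached.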
Decisions are signed with your nsec:
signature = Ed25519.Sign(privateKey, decisionHash)
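How `decisionHash` is derived is not spelled out here. One plausible construction, shown purely as an assumption, is a SHA-256 digest over a canonical JSON encoding of the decision with the `signature` field removed. The function name `decision_hash` and the canonicalization (sorted keys, compact separators) are illustrative choices, not Lighthouse's documented format. The signing call itself is omitted because it needs a third-party cryptography library.

```python
import hashlib
import json

def decision_hash(decision: dict) -> bytes:
    """Hash a decision record deterministically, ignoring any existing signature."""
    unsigned = {k: v for k, v in decision.items() if k != "signature"}
    # Sorted keys and fixed separators make the encoding independent of key order.
    canonical = json.dumps(unsigned, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).digest()

sample = {
    "decision_id": "abc123",
    "decision": "accept",
    "reason_codes": [],
    "ruleset_type": "semantic",
    "ruleset_version": "1.0.0",
}
print(decision_hash(sample).hex())
```

Whatever scheme is used, the signer and every verifier must agree on it byte for byte, otherwise valid signatures will fail to verify.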
The signature proves:
Decisions can be published to Nostr relays.
curator:
  publish_decisions: true
  publish_relays:
    - "wss://relay.example.com"
Decisions are published as Kind 30172 events:
{
  "kind": 30172,
  "content": "{decision_json}",
  "tags": [
    ["d", "decision_id"],
    ["infohash", "aabbccdd..."],
    ["decision", "accept"],
    ["ruleset", "semantic", "1.0.0", "sha256..."]
  ],
  "pubkey": "curator_pubkey",
  "sig": "signature"
}
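The mapping from a decision record to the event layout above can be sketched as follows. The helper name `decision_to_event` is hypothetical. The sketch produces an unsigned event only: the `id`, `created_at`, and `sig` fields that every Nostr event carries are left to whichever signing library is used. Kinds in the 30000 to 39999 range are addressable in Nostr, so a later event from the same pubkey with the same `d` tag replaces the earlier one.

```python
import json

def decision_to_event(decision: dict) -> dict:
    """Build the unsigned Kind 30172 event for a decision record."""
    return {
        "kind": 30172,
        "content": json.dumps(decision, separators=(",", ":")),
        "tags": [
            ["d", decision["decision_id"]],
            ["infohash", decision["target_infohash"]],
            ["decision", decision["decision"]],
            ["ruleset", decision["ruleset_type"],
             decision["ruleset_version"], decision["ruleset_hash"]],
        ],
        "pubkey": decision["curator_pubkey"],
    }

decision = {
    "decision_id": "abc123",
    "decision": "accept",
    "reason_codes": [],
    "ruleset_type": "semantic",
    "ruleset_version": "1.0.0",
    "ruleset_hash": "sha256...",
    "target_infohash": "aabbccdd...",
    "curator_pubkey": "curator_pubkey",
}
event = decision_to_event(decision)
print(event["tags"][0])  # ['d', 'abc123']
```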
Curators receive reports from users.
User submits report → Report queued → Curator reviews → Decision updated
| Category | Description |
|---|---|
| `dmca` | DMCA takedown request |
| `illegal` | Illegal content |
| `spam` | Spam or flooding |
| `malware` | Malicious content |
| `false_info` | Incorrect metadata |
| `duplicate` | Duplicate content |
| `other` | Other issues |
Users can appeal decisions.
User submits appeal → Appeal queued → Curator reviews → Decision reconsidered
Consider focusing on specific content:
{
  "name": "Basic Quality",
  "type": "semantic",
  "version": "1.0.0",
  "rules": [
    {
      "id": "min-size",
      "name": "Minimum Size (100MB)",
      "type": "threshold",
      "field": "size",
      "operator": "gte",
      "value": 100000000,
      "reason_code": "SEM_LOW_QUALITY"
    },
    {
      "id": "no-cam",
      "name": "No CAM Releases",
      "type": "pattern",
      "field": "name",
      "operator": "regex",
      "value": "\\bCAM\\b",
      "reason_code": "SEM_LOW_QUALITY"
    }
  ]
}
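Before deploying a regex rule it is worth trying it against a few sample names. The check below runs the `no-cam` pattern through Python's `re` module as a stand-in; it assumes Lighthouse's regex engine treats `\b` the same way (an ASCII word boundary) and matches case-sensitively. The sample names are made up.

```python
import re

# Same pattern as the "no-cam" rule, after JSON unescaping.
NO_CAM = re.compile(r"\bCAM\b")

samples = [
    "Movie.2024.CAM.x264",
    "Movie.2024.HDCAM.x264",
    "Camera.Obscura.2024.1080p",
]
for name in samples:
    print(name, bool(NO_CAM.search(name)))
# Movie.2024.CAM.x264 True
# Movie.2024.HDCAM.x264 False
# Camera.Obscura.2024.1080p False
```

The word boundaries keep titles such as "Camera" from matching, but they also let `HDCAM` through, so a curator who wants to catch that variant would need to extend the pattern.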
{
  "name": "Anti-Spam",
  "type": "censoring",
  "version": "1.0.0",
  "rules": [
    {
      "id": "spam-tags",
      "name": "Spam Tags",
      "type": "list",
      "field": "tags",
      "operator": "contains_any",
      "value": ["spam", "fake", "virus"],
      "reason_code": "ABUSE_SPAM"
    },
    {
      "id": "spam-names",
      "name": "Spam Names",
      "type": "pattern",
      "field": "name",
      "operator": "regex",
      "value": "(\\bFREE\\b.*\\bDOWNLOAD\\b|\\bCLICK\\s+HERE\\b)",
      "reason_code": "ABUSE_SPAM"
    }
  ]
}
Track your curation effectiveness:
| Metric | Description |
|---|---|
| Decisions/day | Volume processed |
| Accept rate | % accepted |
| Appeal rate | % decisions appealed |
| Overturn rate | % appeals resulting in change |
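
The four metrics reduce to simple ratios over raw counts. The sketch below shows one way to compute them; the function name `curation_metrics` and the input counts are hypothetical, and the rates are returned as fractions rather than percentages.

```python
def curation_metrics(decisions: int, accepted: int, appeals: int,
                     overturned: int, days: int) -> dict:
    """Compute the four curation metrics from raw counts."""
    if decisions <= 0 or days <= 0:
        raise ValueError("decisions and days must be positive")
    return {
        "decisions_per_day": decisions / days,
        "accept_rate": accepted / decisions,
        "appeal_rate": appeals / decisions,
        # Overturn rate is relative to appeals, not to all decisions.
        "overturn_rate": overturned / appeals if appeals else 0.0,
    }

m = curation_metrics(decisions=1200, accepted=900, appeals=24, overturned=6, days=30)
print(m)
# {'decisions_per_day': 40.0, 'accept_rate': 0.75, 'appeal_rate': 0.02, 'overturn_rate': 0.25}
```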