Compiler-as-a-Service for adblock filter lists. Transform, optimize, and combine filter lists from multiple sources with real-time progress tracking.
🌐 Try the Web UI | 🚀 API Endpoint | 📚 API Documentation
Note: This is a Deno-native rewrite of the original @adguard/hostlist-compiler. It provides the same functionality with improved performance and no Node.js dependencies.
- 🎯 Multi-Source Compilation - Combine filter lists from URLs, files, or inline rules
- ⚡ Performance - Gzip compression (70-80% cache reduction), request deduplication, smart caching
- 🔄 Circuit Breaker - Automatic retry with exponential backoff for unreliable sources
- 📊 Visual Diff - See what changed between compilations
- 🎪 Batch Processing - Compile up to 10 lists in parallel
- 📡 Event Pipeline - Real-time progress tracking via Server-Sent Events
- 🌍 Universal - Works in Deno, Node.js, Cloudflare Workers, browsers
- 🎨 11 Transformations - Deduplicate, compress, validate, and more
Run directly without installation:
deno run --allow-read --allow-write --allow-net jsr:@jk-com/adblock-compiler -c config.json -o output.txt

Or install globally:
deno install --allow-read --allow-write --allow-net -n hostlist-compiler jsr:@jk-com/adblock-compiler/cli

Clone the repository and compile:
git clone https://github.com/jaypatrick/adblock-compiler.git
cd adblock-compiler
deno task build

This creates a standalone hostlist-compiler executable.
Quick hosts conversion
Convert and compress a /etc/hosts-syntax blocklist to AdGuard syntax.
hostlist-compiler -i hosts.txt -i hosts2.txt -o output.txt

Build a configurable blocklist from multiple sources
Prepare the list configuration (read more about that below) and run the compiler:
hostlist-compiler -c configuration.json -o output.txt

All command line options
Usage: hostlist-compiler [options]
Options:
--config, -c Path to the compiler configuration file [string]
--input, -i URL (or path to a file) to convert to an AdGuard-syntax
blocklist. Can be specified multiple times. [array]
--input-type, -t Type of the input file (hosts|adblock) [string]
--output, -o Path to the output file [string] [required]
--verbose, -v Run with verbose logging [boolean]
--version Show version number [boolean]
-h, --help Show help [boolean]
Examples:
hostlist-compiler -c config.json -o compile a blocklist and write the
output.txt output to output.txt
hostlist-compiler -i compile a blocklist from the URL and
https://example.org/hosts.txt -o write the output to output.txt
output.txt
Configuration defines your filter list sources and the transformations that are applied to them.
Here is an example of this configuration:
{
"name": "List name",
"description": "List description",
"homepage": "https://example.org/",
"license": "GPLv3",
"version": "1.0.0.0",
"sources": [
{
"name": "Local rules",
"source": "rules.txt",
"type": "adblock",
"transformations": ["RemoveComments", "Compress"],
"exclusions": ["excluded rule 1"],
"exclusions_sources": ["exclusions.txt"],
"inclusions": ["*"],
"inclusions_sources": ["inclusions.txt"]
},
{
"name": "Remote rules",
"source": "https://example.org/rules",
"type": "hosts",
"exclusions": ["excluded rule 1"]
}
],
"transformations": ["Deduplicate", "Compress"],
"exclusions": ["excluded rule 1", "excluded rule 2"],
"exclusions_sources": ["global_exclusions.txt"],
"inclusions": ["*"],
"inclusions_sources": ["global_inclusions.txt"]
}

- `name` - (mandatory) the list name.
- `description` - (optional) the list description.
- `homepage` - (optional) URL to the list homepage.
- `license` - (optional) filter list license.
- `version` - (optional) filter list version.
- `sources` - (mandatory) array of the list sources.
  - `.source` - (mandatory) path or URL of the source. It can be a traditional filter list or a hosts file.
  - `.name` - (optional) name of the source.
  - `.type` - (optional) type of the source. It can be `adblock` for Adblock-style lists or `hosts` for /etc/hosts-style lists. If not specified, `adblock` is assumed.
  - `.transformations` - (optional) a list of transformations to apply to the source rules. By default, no transformations are applied. Learn more about possible transformations here.
  - `.exclusions` - (optional) a list of rules (or wildcards) to exclude from the source.
  - `.exclusions_sources` - (optional) a list of files with exclusions.
  - `.inclusions` - (optional) a list of wildcards to include from the source. All rules that don't match these wildcards won't be included.
  - `.inclusions_sources` - (optional) a list of files with inclusions.
- `transformations` - (optional) a list of transformations to apply to the final list of rules. By default, no transformations are applied. Learn more about possible transformations here.
- `exclusions` - (optional) a list of rules (or wildcards) to exclude from the final list.
- `exclusions_sources` - (optional) a list of files with exclusions.
- `inclusions` - (optional) a list of wildcards to include in the final list. All rules that don't match these wildcards won't be included.
- `inclusions_sources` - (optional) a list of files with inclusions.
Here is an example of a minimal configuration:
{
"name": "test list",
"sources": [
{
"source": "rules.txt"
}
]
}

Exclusion and inclusion rules
Please note that an exclusion or inclusion rule may be a plain string, a wildcard, or a regular expression.
- `plainstring` - every rule that contains `plainstring` matches
- `*.plainstring` - every rule that matches this wildcard matches
- `/regex/` - every rule that matches this regular expression matches. By default, regular expressions are case-insensitive.
- `! comment` - comments are ignored
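As a rough sketch, matching a rule against these three pattern forms could look like the following. This is illustrative only; `matchesExclusion` is a hypothetical helper, not the compiler's actual API:

```typescript
// Hypothetical helper showing how the three exclusion formats could be matched.
function matchesExclusion(rule: string, exclusion: string): boolean {
  const pattern = exclusion.trim();
  if (pattern === "" || pattern.startsWith("!")) return false; // comments are ignored
  if (pattern.length > 2 && pattern.startsWith("/") && pattern.endsWith("/")) {
    // /regex/ - case-insensitive by default
    return new RegExp(pattern.slice(1, -1), "i").test(rule);
  }
  if (pattern.includes("*")) {
    // Wildcard: escape regex metacharacters, then turn * into .*
    const escaped = pattern
      .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
      .replace(/\*/g, ".*");
    return new RegExp(`^${escaped}$`).test(rule);
  }
  return rule.includes(pattern); // plain string: substring match
}
```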
Important
Ensure that rules in the exclusion list match the format of the rules in the filter list.
To maintain a consistent format, add the Compress transformation to convert /etc/hosts rules to adblock syntax.
This is especially useful if you have multiple lists in different formats.
Here is an example:
Rules in HOSTS syntax: /hosts.txt
0.0.0.0 ads.example.com
0.0.0.0 tracking.example1.com
0.0.0.0 example.com

Exclusion rules in adblock syntax: /exclusions.txt
||example.com^

Configuration of the final list:
{
"name": "List name",
"description": "List description",
"sources": [
{
"name": "HOSTS rules",
"source": "hosts.txt",
"type": "hosts",
"transformations": ["Compress"]
}
],
"transformations": ["Deduplicate", "Compress"],
"exclusions_sources": ["exclusions.txt"]
}

Final filter output of /hosts.txt after applying the Compress transformation and exclusions:
||ads.example.com^
||tracking.example1.com^

The last rule, now ||example.com^, correctly matches the entry in the exclusion list and is excluded.
Import from JSR:
import { compile } from "jsr:@anthropic/hostlist-compiler";

Or add to your deno.json:
{
"imports": {
"@anthropic/hostlist-compiler": "jsr:@anthropic/hostlist-compiler"
}
}

import { compile } from "@anthropic/hostlist-compiler";
import type { IConfiguration } from "@anthropic/hostlist-compiler";
const config: IConfiguration = {
name: "Your Hostlist",
sources: [
{
type: "adblock",
source: "https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt",
transformations: ["RemoveComments", "Validate"],
},
],
transformations: ["Deduplicate"],
};
// Compile filters
const result = await compile(config);
// Write to file
await Deno.writeTextFile("your-hostlist.txt", result.join("\n"));

import { FilterCompiler, ConsoleLogger } from "@anthropic/hostlist-compiler";
import type { IConfiguration } from "@anthropic/hostlist-compiler";
const logger = new ConsoleLogger();
const compiler = new FilterCompiler(logger);
const config: IConfiguration = {
name: "Your Hostlist",
sources: [
{
source: "rules.txt",
type: "hosts",
},
],
transformations: ["Compress", "Deduplicate"],
};
const result = await compiler.compile(config);
console.log(`Compiled ${result.length} rules`);

Here is the full list of transformations that are available:

- RemoveComments
- Compress
- RemoveModifiers
- Validate
- ValidateAllowIp
- Deduplicate
- InvertAllow
- RemoveEmptyLines
- TrimLines
- InsertFinalNewLine
- ConvertToAscii (always performed first)
Please note that these transformations are always applied in the order specified here.
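In other words, the transformations requested in a configuration are re-ordered into this fixed pipeline before they run. A minimal sketch of that idea (assumed behavior, not the compiler's actual internals):

```typescript
// Fixed pipeline order, mirroring the list above
// (ConvertToAscii is special-cased to always run first).
const PIPELINE_ORDER = [
  "ConvertToAscii",
  "RemoveComments",
  "Compress",
  "RemoveModifiers",
  "Validate",
  "ValidateAllowIp",
  "Deduplicate",
  "InvertAllow",
  "RemoveEmptyLines",
  "TrimLines",
  "InsertFinalNewLine",
];

// The requested transformations run in pipeline order,
// regardless of how they are listed in the configuration.
function orderTransformations(requested: string[]): string[] {
  return PIPELINE_ORDER.filter((t) => requested.includes(t));
}
```

For example, requesting `["Deduplicate", "RemoveComments"]` still runs RemoveComments before Deduplicate.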
This is a very simple transformation that simply removes comments (e.g. all rules starting with ! or #).
Important
This transformation converts hosts lists into adblock lists.
Here's what it does:
- It converts all rules to adblock-style rules. For instance, `0.0.0.0 example.org` will be converted to `||example.org^`.
- It discards rules that are now redundant because of other existing rules. For instance, `||example.org` blocks `example.org` and all its subdomains, therefore additional rules for those subdomains are redundant.
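A simplified sketch of both steps (assumed behavior; the real transformation handles more edge cases and hosts formats):

```typescript
// Illustrative sketch of Compress, not the actual implementation.
function compress(rules: string[]): string[] {
  // Step 1: convert /etc/hosts entries like "0.0.0.0 example.org" to "||example.org^".
  const converted = rules.map((rule) => {
    const match = rule.match(/^(?:0\.0\.0\.0|127\.0\.0\.1)\s+(\S+)$/);
    return match ? `||${match[1]}^` : rule;
  });
  // Step 2: drop rules made redundant by a rule that already blocks a parent domain.
  const blocked = new Set<string>();
  for (const rule of converted) {
    const match = rule.match(/^\|\|([^\^]+)\^$/);
    if (match) blocked.add(match[1]);
  }
  return converted.filter((rule) => {
    const match = rule.match(/^\|\|([^\^]+)\^$/);
    if (!match) return true;
    const labels = match[1].split(".");
    // If any ancestor domain is already blocked, this rule is redundant.
    for (let i = 1; i < labels.length; i++) {
      if (blocked.has(labels.slice(i).join("."))) return false;
    }
    return true;
  });
}
```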
By default, AdGuard Home ignores rules with unsupported modifiers, and all of the modifiers listed here are unsupported. However, rules with these modifiers are likely to be okay for DNS-level blocking, which is why you might want to remove the modifiers when importing rules from a traditional filter list.
Here is the list of modifiers that will be removed:
- `$third-party` and `$3p` modifiers
- `$document` and `$doc` modifiers
- `$all` modifier
- `$popup` modifier
- `$network` modifier
Caution
Blindly removing $third-party from traditional ad blocking rules leads to many false positives.
This is exactly why there is an option to exclude rules - you may need to use it.
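A minimal sketch of what stripping these modifiers from a single rule might look like (illustrative only; `removeModifiers` here is a hypothetical helper, and the real transformation's parsing is more involved):

```typescript
// Modifiers stripped by the transformation (from the list above).
const REMOVED_MODIFIERS = ["third-party", "3p", "document", "doc", "all", "popup", "network"];

// Remove the listed modifiers from a rule's "$..." modifier section.
function removeModifiers(rule: string): string {
  const idx = rule.lastIndexOf("$");
  if (idx === -1) return rule; // no modifier section
  const kept = rule
    .slice(idx + 1)
    .split(",")
    .filter((modifier) => !REMOVED_MODIFIERS.includes(modifier));
  // Keep any remaining modifiers; drop the "$" entirely if none remain.
  return kept.length > 0 ? `${rule.slice(0, idx)}$${kept.join(",")}` : rule.slice(0, idx);
}
```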
This transformation is really crucial if you're using a filter list for a traditional ad blocker as a source.
It removes dangerous or incompatible rules from the list.
So here's what it does:
- Discards domain-specific rules (e.g. `||example.org^$domain=example.com`). You don't want domain-specific rules working globally.
- Discards rules with unsupported modifiers. Click here to learn more about which modifiers are supported.
- Discards rules that are too short.
- Discards IP addresses. If you need to keep IP addresses, use ValidateAllowIp instead.
- Removes rules that block entire top-level domains (TLDs) like `||*.org^`, unless they have specific limiting modifiers such as `$denyallow`, `$badfilter`, or `$client`. Examples:
  - `||*.org^` - this rule will be removed
  - `||*.org^$denyallow=example.com` - this rule will be kept because it has a limiting modifier
If there are comments preceding the invalid rule, they will be removed as well.
This transformation exactly repeats the behavior of Validate, but leaves the IP addresses in the lists.
This transformation simply removes the duplicates from the specified source.
There are two important notes about this transformation:
- It keeps the original rules order.
- It ignores comments. However, if a comment precedes a rule that is being removed, the comment will also be removed.
For instance:
! rule1 comment 1
rule1
! rule1 comment 2
rule1
Here's what will be left after the transformation:
! rule1 comment 2
rule1
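The example above suggests that the last occurrence of a duplicated rule survives, along with its comments. A sketch of that logic (assumed behavior, not the actual implementation):

```typescript
// Illustrative Deduplicate sketch: keep the last occurrence of each rule,
// dropping earlier duplicates together with the comments that precede them.
function deduplicate(rules: string[]): string[] {
  const seen = new Set<string>();
  const keep = new Array<boolean>(rules.length).fill(true);
  const isComment = (line: string) => line.startsWith("!") || line.startsWith("#");
  // Walk backwards so the last occurrence of each rule survives.
  for (let i = rules.length - 1; i >= 0; i--) {
    const rule = rules[i].trim();
    if (rule === "" || isComment(rule)) continue;
    if (seen.has(rule)) {
      keep[i] = false;
      // Also drop comments that immediately precede the removed duplicate.
      for (let j = i - 1; j >= 0 && isComment(rules[j]); j--) {
        keep[j] = false;
      }
    } else {
      seen.add(rule);
    }
  }
  return rules.filter((_, i) => keep[i]);
}
```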
This transformation converts blocking rules to "allow" rules. Note that it does nothing to /etc/hosts rules (unless they were previously converted to adblock-style syntax by a different transformation, for example Compress).
There are two important notes about this transformation:
- It keeps the original rules order.
- It ignores comments, empty lines, /etc/hosts rules and existing "allow" rules.
Example:
Original list:
! comment 1
rule1
# comment 2
192.168.11.11 test.local
@@rule2
Here's what we will have after applying this transformation:
! comment 1
@@rule1
# comment 2
192.168.11.11 test.local
@@rule2
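The behavior above can be sketched as follows (illustrative only; the real transformation's rule detection is more thorough):

```typescript
// Illustrative InvertAllow sketch: prefix blocking rules with "@@",
// leaving comments, empty lines, /etc/hosts rules, and existing allow rules untouched.
function invertAllow(rules: string[]): string[] {
  return rules.map((rule) => {
    const trimmed = rule.trim();
    if (trimmed === "" || trimmed.startsWith("!") || trimmed.startsWith("#")) {
      return rule; // empty lines and comments
    }
    if (trimmed.startsWith("@@")) return rule; // already an "allow" rule
    if (/^\d{1,3}(\.\d{1,3}){3}\s+\S+/.test(trimmed)) return rule; // /etc/hosts rule
    return `@@${rule}`;
  });
}
```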
This is a very simple transformation that removes empty lines.
Example:
Original list:
rule1

rule2

rule3
Here's what we will have after applying this transformation:
rule1
rule2
rule3
This is a very simple transformation that removes leading and trailing spaces/tabs.
Example:
Original list:
  rule1
	rule2
   rule3
rule4
Here's what we will have after applying this transformation:
rule1
rule2
rule3
rule4
This is a very simple transformation that inserts a final newline.
Example:
Original list:
rule1
rule2
rule3
Here's what we will have after applying this transformation:
rule1
rule2
rule3

RemoveEmptyLines doesn't delete this trailing empty line because of the execution order.
This transformation converts all non-ASCII characters to their ASCII equivalents. It is always performed first.
Example:
Original list:
||*.рус^
||*.कॉम^
||*.セール^
Here's what we will have after applying this transformation:
||*.xn--p1acf^
||*.xn--11b4c3d^
||*.xn--1qqw23a^
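Under the hood this is IDNA/punycode encoding. In any runtime with the WHATWG URL API, the per-domain conversion can be sketched like this (illustrative; the actual transformation also preserves the surrounding rule syntax such as `||`, `*.`, and `^`):

```typescript
// Convert an internationalized domain name to its ASCII (punycode) form
// using the URL parser's built-in IDNA handling.
function toAsciiDomain(domain: string): string {
  return new URL(`http://${domain}/`).hostname;
}
```

For example, `toAsciiDomain("рус")` yields `"xn--p1acf"`, matching the output shown above.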
AdBlock Compiler is designed to be fully extensible. You can:
- Create custom transformations - Extend `SyncTransformation` or `AsyncTransformation` to add custom rule processing
- Implement custom fetchers - Support any protocol or data source by implementing `IContentFetcher`
- Build custom compilers - Extend `FilterCompiler` or `WorkerCompiler` for specialized use cases
- Integrate custom loggers - Implement `ILogger` to integrate with your logging system
- Add event handlers - Implement `ICompilerEvents` for custom monitoring and tracking
Example: Custom Transformation
import { SyncTransformation, TransformationType, TransformationRegistry } from '@jk-com/adblock-compiler';
class RemoveSocialMediaTransformation extends SyncTransformation {
public readonly type = 'RemoveSocialMedia' as TransformationType;
public readonly name = 'Remove Social Media';
private socialDomains = ['facebook.com', 'twitter.com', 'instagram.com'];
public executeSync(rules: string[]): string[] {
return rules.filter(rule => {
return !this.socialDomains.some(domain => rule.includes(domain));
});
}
}
// Register and use
const registry = new TransformationRegistry();
registry.register('RemoveSocialMedia' as any, new RemoveSocialMediaTransformation());
const compiler = new FilterCompiler({ transformationRegistry: registry });

Example: Custom Fetcher
import { IContentFetcher, CompositeFetcher, HttpFetcher } from '@jk-com/adblock-compiler';
class DatabaseFetcher implements IContentFetcher {
canHandle(source: string): boolean {
return source.startsWith('db://');
}
async fetch(source: string): Promise<string> {
const [table, column] = source.replace('db://', '').split('/');
// Your database query implementation
return await queryDatabase(table, column);
}
}
// Use with CompositeFetcher
const fetcher = new CompositeFetcher([
new DatabaseFetcher(),
new HttpFetcher(),
]);

📚 For complete extensibility examples and patterns, see docs/EXTENSIBILITY.md
Topics covered:
- Custom transformations (sync and async)
- Custom content fetchers
- Custom event handlers
- Custom loggers
- Extending the compiler
- Plugin systems
- Best practices
- Deno 2.0 or later
# Run in development mode with watch
deno task dev
# Run the compiler
deno task compile
# Build standalone executable
deno task build
# Run tests
deno task test
# Run tests in watch mode
deno task test:watch
# Run tests with coverage
deno task test:coverage
# Lint code
deno task lint
# Format code
deno task fmt
# Check formatting
deno task fmt:check
# Type check
deno task check
# Cache dependencies
deno task cache

src/
├── cli/ # Command-line interface
├── compiler/ # Core compilation logic
├── configuration/ # Configuration validation
├── downloader/ # Filter list downloading
├── platform/ # Platform abstraction layer (Workers, browsers)
├── transformations/ # Rule transformations
├── types/ # TypeScript type definitions
├── utils/ # Utility functions
├── index.ts # Main library exports
└── mod.ts # Deno module exports
examples/
└── cloudflare-worker/ # Cloudflare Worker deployment example
# Dry run to verify everything is correct
deno publish --dry-run
# Publish to JSR
deno publish

The hostlist-compiler includes a platform abstraction layer that enables running in any JavaScript runtime, including:
- Deno (default)
- Node.js (via npm compatibility)
- Cloudflare Workers
- Deno Deploy
- Vercel Edge Functions
- AWS Lambda@Edge
- Web Workers (browser)
- Browsers (with server-side proxy for CORS)
The platform layer is designed to be pluggable - you can easily add or remove fetchers without modifying the core compiler.
The platform layer provides:
- `WorkerCompiler` - A platform-agnostic compiler that works without file system access
- `PreFetchedContentFetcher` - Supply source content directly instead of fetching from URLs
- `HttpFetcher` - Standard Fetch API-based content fetching (works everywhere)
- `CompositeFetcher` - Chain multiple fetchers together (pre-fetched takes priority)
- `PlatformDownloader` - Handles `!#include` directives and conditional compilation
The WorkerCompiler works in any edge runtime or serverless environment that supports the standard Fetch API. The pattern is the same across all platforms:
- Pre-fetch source content on the server (avoids CORS and network restrictions)
- Pass content to the compiler via `preFetchedContent`
- Configure and compile using the standard API
import {
WorkerCompiler,
HttpFetcher,
PreFetchedContentFetcher,
CompositeFetcher,
type IConfiguration,
} from '@anthropic/hostlist-compiler';
// Option 1: Use pre-fetched content (recommended for edge)
async function compileWithPreFetched(sourceUrls: string[]): Promise<string[]> {
// Fetch all sources
const preFetched = new Map<string, string>();
for (const url of sourceUrls) {
const response = await fetch(url);
preFetched.set(url, await response.text());
}
const compiler = new WorkerCompiler({ preFetchedContent: preFetched });
const config: IConfiguration = {
name: 'My Filter List',
sources: sourceUrls.map(url => ({ source: url })),
transformations: ['Deduplicate', 'RemoveEmptyLines'],
};
return compiler.compile(config);
}
// Option 2: Build a custom fetcher chain
function createCustomCompiler() {
const preFetched = new PreFetchedContentFetcher(new Map([
['local://rules', 'my-custom-rule'],
]));
const http = new HttpFetcher();
const composite = new CompositeFetcher([preFetched, http]);
return new WorkerCompiler({ customFetcher: composite });
}

The compiler runs natively in Cloudflare Workers. See the examples/cloudflare-worker directory for a complete example with SSE streaming.
Deployment: A wrangler.toml configuration file is provided in the repository root for easy deployment via Cloudflare's Git integration or using wrangler deploy.
import { WorkerCompiler, type IConfiguration } from '@anthropic/hostlist-compiler';
export default {
async fetch(request: Request): Promise<Response> {
// Pre-fetch content on the server where there are no CORS restrictions
const sourceContent = await fetch('https://example.com/filters.txt').then(r => r.text());
const compiler = new WorkerCompiler({
preFetchedContent: {
'https://example.com/filters.txt': sourceContent,
},
});
const configuration: IConfiguration = {
name: 'My Filter List',
sources: [
{ source: 'https://example.com/filters.txt' },
],
transformations: ['Deduplicate', 'RemoveEmptyLines'],
};
const result = await compiler.compile(configuration);
return new Response(result.join('\n'), {
headers: { 'Content-Type': 'text/plain' },
});
},
};

The Cloudflare Worker example includes:
- JSON API endpoint for programmatic access
- Server-Sent Events (SSE) streaming for real-time progress
- Pre-fetched content support to bypass CORS restrictions
- Benchmarking metrics support
Use WorkerCompiler in Web Workers for background compilation:
// worker.ts
import { WorkerCompiler, type IConfiguration } from '@anthropic/hostlist-compiler';
self.onmessage = async (event) => {
const { configuration, preFetchedContent } = event.data;
const compiler = new WorkerCompiler({
preFetchedContent,
events: {
onProgress: (progress) => {
self.postMessage({ type: 'progress', progress });
},
},
});
const result = await compiler.compile(configuration);
self.postMessage({ type: 'complete', rules: result });
};

For browser environments, pre-fetch all source content server-side to avoid CORS issues:
import { WorkerCompiler, type IConfiguration } from '@anthropic/hostlist-compiler';
// Fetch sources through your server proxy to avoid CORS
async function fetchSources(urls: string[]): Promise<Map<string, string>> {
const content = new Map<string, string>();
for (const url of urls) {
const response = await fetch(`/api/proxy?url=${encodeURIComponent(url)}`);
content.set(url, await response.text());
}
return content;
}
// Usage
const sources = await fetchSources([
'https://example.com/filters.txt',
]);
const compiler = new WorkerCompiler({
preFetchedContent: Object.fromEntries(sources),
});
const configuration: IConfiguration = {
name: 'Browser Compiled List',
sources: [
{ source: 'https://example.com/filters.txt' },
],
};
const rules = await compiler.compile(configuration);

interface WorkerCompilerOptions {
// Pre-fetched content (Map or Record)
preFetchedContent?: Map<string, string> | Record<string, string>;
// Custom content fetcher (for advanced use cases)
customFetcher?: IContentFetcher;
// Compilation event handlers
events?: ICompilerEvents;
// Logger instance
logger?: ILogger;
}
class WorkerCompiler {
constructor(options?: WorkerCompilerOptions);
// Compile and return rules
compile(configuration: IConfiguration): Promise<string[]>;
// Compile with optional benchmarking metrics
compileWithMetrics(
configuration: IConfiguration,
benchmark?: boolean
): Promise<WorkerCompilationResult>;
}

Implement this interface to create custom content fetchers:
interface IContentFetcher {
canHandle(source: string): boolean;
fetch(source: string): Promise<string>;
}

You can implement custom fetchers for specialized use cases:
import {
WorkerCompiler,
CompositeFetcher,
HttpFetcher,
type IContentFetcher,
} from '@anthropic/hostlist-compiler';
// Example: Redis-backed cache fetcher
class RedisCacheFetcher implements IContentFetcher {
constructor(private redis: RedisClient, private ttl: number) {}
canHandle(source: string): boolean {
return source.startsWith('http://') || source.startsWith('https://');
}
async fetch(source: string): Promise<string> {
const cached = await this.redis.get(`filter:${source}`);
if (cached) return cached;
const response = await fetch(source);
const content = await response.text();
await this.redis.setex(`filter:${source}`, this.ttl, content);
return content;
}
}
// Example: S3/R2-backed storage fetcher
class S3StorageFetcher implements IContentFetcher {
constructor(private bucket: S3Bucket) {}
canHandle(source: string): boolean {
return source.startsWith('s3://');
}
async fetch(source: string): Promise<string> {
const key = source.replace('s3://', '');
const object = await this.bucket.get(key);
return object?.text() ?? '';
}
}
// Chain fetchers together - first match wins
const compiler = new WorkerCompiler({
customFetcher: new CompositeFetcher([
new RedisCacheFetcher(redis, 3600),
new S3StorageFetcher(bucket),
new HttpFetcher(),
]),
});

This pluggable architecture allows you to:
- Add caching layers (Redis, KV, memory)
- Support custom protocols (S3, R2, database)
- Implement authentication/authorization
- Add logging and metrics
- Mock sources for testing
MIT