Xcrap Factory is a set of utilities for dynamically creating instances of clients, extractors, and parsing models, making it easier to configure and extend scraping and parsing pipelines.
Installation is straightforward—just use your favorite dependency manager. Here’s an example using NPM:
npm i @xcrap/factory- createClient: Instantiates clients from a registry of allowed classes.
- createExtractor: Creates extractor functions from configurable text and a registry of allowed extractors.
- createParsingModel: Builds validated and nested parsing models with customizable extractors and types.
import { GotScrapingClient } from "@xcrap/got-scraping-client"
import { AxiosClient } from "@xcrap/axios-client"
import { createClient } from "@xcrap/factory"
const config = {
allowedClients: {
"got-scraping": GotScrapingClient,
"axios": AxiosClient
}
}
const client = createClient({
config: config,
type: "...", // Client type
options: {...} // Client constructor options
})import { extractInnerText, extractSrc, extractHref, extractAttribute } from "@xcrap/parser"
import { createExtractor } from "@xcrap/factory"
const config = {
allowedExtractors: {
innerText: extractInnerText,
src: extractSrc,
href: extractHref,
attribute: extractAttribute // extractAttribute(name: string) -> Generates an extractor
},
argumentSeparator: ":" // Optional | Usage example -> "attribute:value"
}
const extractor = createExtractor({
extractorText: "..", // innerText, src, href, attribute:ATTRIBUTE_NAME...
config: config
})import { HtmlParsingModel, JsonParsingModel } from "@xcrap/parser"
import { createParsingModel } from "@xcrap/factory"
const config = {
allowedExtractors: {...},
extractorArgumentSeparator: "...", // Optional
allowedModels: {
html: HtmlParsingModel,
json: JsonParsingModel
}
}
const parsingModel = createParsingModel({
config: config,
model: {
type: "html", // Model type: html, json..
model: {
title: {
query: "title",
extractor: "innerText",
},
bodyData: { // Nested model
query: "body",
nested: {
type: "html",
model: {
heading: {
query: "h1",
extractor: "innerText"
}
}
}
}
}
}
})Automated tests are located in __tests__. To run them:
npm run test- Want to contribute? Follow these steps:
- Fork the repository.
- Create a new branch (git checkout -b feature-new).
- Commit your changes (git commit -m 'Add new feature').
- Push to the branch (git push origin feature-new).
- Open a Pull Request.
This project is licensed under the MIT License.