Repositories list
55 repositories
- Transforms complex documents like PDFs into LLM-ready Markdown/JSON for your agentic workflows.
- MinerU-HTML: An SLM-powered HTML main content extractor that outputs clean HTML bodies. Perfect for Deep Research Agents, RAG applications, and training data generation.
- Data browser based on S3: a visualization tool for data stored in S3 (JSON, JSONL, Parquet, HTML, Markdown, etc.). 👇 Try online.
- labelU
- UniMERNet
- SDK of OpenDataLab - https://opendatalab.org.cn
- MLS-BRN
- opendatalab-datasets
- .github