docling
Docling is an open-source, C-based PDF extractor with Python bindings that converts documents into structured JSON for high-throughput retrieval-augmented generation pipelines. It is aimed at developers who need very fast, layout-aware text and metadata extraction from large volumes of PDFs.
Stories
Completed digest stories linked to this service.
- DocLang launches: an AI‑native document standard for enterprise RAG
  2026-06-17
  LF AI & Data launched DocLang, an AI-native document spec designed to make business documents machine-readable...