Text extraction from doc, docx, pdf, xls, xlsx, ppt, pptx, and similar files using Apache POI and PDFBox, with keyword extraction from the content using HanLP.

Dadagum/document-extractor
