AI Documents (PDF) Support


Feature Name	AI Documents (PDF)
Feature ID	`CrestApps.OrchardCore.AI.Documents.Pdf`

Extends the 'AI Documents' feature by allowing PDF files.

Overview

This module extends the AI Documents feature with PDF document support.

Features

PDF Text Extraction: Extract text content from PDF documents
Page-by-Page Processing: Text is extracted from each page of the PDF

Getting Started

Enable the AI Documents (PDF) feature in the Orchard Core admin
Upload PDF files in the Documents tab of your chat interactions
Text content is extracted automatically and used for retrieval-augmented generation (RAG)

Technical Details

This module uses the PdfPig library for PDF text extraction via an IngestionDocumentReader implementation. PdfPig is a fully open-source PDF library that:

Extracts text content from PDF documents
Does not require any external dependencies
Works cross-platform
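
To illustrate the page-by-page idea in a language-neutral way, here is a minimal Python sketch of how per-page text could be turned into numbered chunks for retrieval. It is not the module's code (the module is a .NET project built on PdfPig); the names `PageChunk` and `pages_to_chunks` are hypothetical, and the input is assumed to be a list of strings, one per page, already produced by some PDF text extractor.

```python
from dataclasses import dataclass


@dataclass
class PageChunk:
    """One page of extracted text (hypothetical type, for illustration only)."""
    page_number: int  # 1-based page index
    text: str


def pages_to_chunks(page_texts):
    """Turn per-page text into numbered chunks, skipping blank pages."""
    chunks = []
    for number, text in enumerate(page_texts, start=1):
        cleaned = " ".join(text.split())  # collapse runs of whitespace
        if cleaned:
            chunks.append(PageChunk(page_number=number, text=cleaned))
    return chunks
```

Keeping the page number with each chunk is a design choice of this sketch: it lets a retrieval step cite where in the PDF a passage came from.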

Limitations

Scanned PDFs: Text in scanned documents that contain images of text (not actual text) will not be extracted correctly. For best results, use PDFs with actual text content.
Complex Layouts: Text extracted from PDFs with complex layouts may not preserve the original formatting.
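
One way to spot the scanned-PDF case before relying on the extracted text is a simple heuristic over the per-page output. The sketch below is an assumption about how such a check could look, not something the module is documented to do; `looks_scanned` and both thresholds are hypothetical and would need tuning on real documents.

```python
def looks_scanned(page_texts, min_chars_per_page=20, sparse_ratio=0.8):
    """Guess whether a PDF is image-only from its extracted per-page text.

    A page counts as "sparse" when it yields fewer than min_chars_per_page
    non-whitespace-trimmed characters. The document is flagged when at least
    sparse_ratio of its pages are sparse. Both thresholds are illustrative.
    """
    if not page_texts:
        return False  # nothing to judge
    sparse = sum(1 for text in page_texts if len(text.strip()) < min_chars_per_page)
    return sparse / len(page_texts) >= sparse_ratio
```

A caller could use the result to warn the user that the upload probably needs a text-based version of the file.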

Supported File Types

Extension	MIME Type
.pdf	application/pdf
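
The table above can be expressed as a small upload check. The following Python sketch is illustrative only and does not describe how the module validates uploads; `is_supported_pdf` is a hypothetical helper. The optional `head` argument, an assumption of this sketch, holds the first bytes of the file so that the `%PDF-` header every PDF starts with can be verified as well.

```python
from pathlib import PurePath

PDF_EXTENSION = ".pdf"
PDF_MIME_TYPE = "application/pdf"
PDF_MAGIC = b"%PDF-"  # header bytes at the start of a PDF file


def is_supported_pdf(file_name, mime_type, head=b""):
    """Check extension and MIME type, plus the file header when provided."""
    ext_ok = PurePath(file_name).suffix.lower() == PDF_EXTENSION
    # Ignore MIME parameters such as "; charset=binary".
    mime_ok = mime_type.split(";")[0].strip().lower() == PDF_MIME_TYPE
    # Skip the header check when no bytes were supplied.
    magic_ok = head.startswith(PDF_MAGIC) if head else True
    return ext_ok and mime_ok and magic_ok
```

Checking the header in addition to the extension guards against files that were merely renamed to `.pdf`.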
