Uploading Documents
Upload PDF files, Word documents, CSV files, or text files to your Knowledge Base. Your agent will learn from these documents and use them to answer customer questions.
What File Types Work
Documents:
- PDF files (.pdf)
- Word documents (.docx)
- Text files (.txt)
- Markdown files (.md)
- CSV files (.csv)
Typical content to upload:
- Policies and guidelines
- Product catalogs
- FAQ documents
- Support guides
- Price lists
- Specifications
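The supported file types listed above can be checked locally before you start an upload. The following is a minimal sketch, not part of the product: the function names `is_supported` and `split_by_support` are hypothetical, and the check looks only at the file extension.

```python
from pathlib import Path

# Extensions taken from the supported file types listed above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".csv"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the supported types."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Partition filenames into (supported, unsupported) lists."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported


if __name__ == "__main__":
    ok, rejected = split_by_support(["Return Policy 2024.pdf", "prices.xlsx"])
    print("Ready to upload:", ok)
    print("Convert first:", rejected)
```

Files that end up in the second list would need to be exported to one of the supported formats first.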
Upload Steps
1. Go to Knowledge Base
- Click your agent name
- Click “Knowledge Base” in the sidebar
2. Click the upload button
- Button is at the top
- Or drag and drop files directly
3. Choose your files
- Select from your computer
- Multiple files at once (optional)
4. Name the document
- Example: “Return Policy 2024”
- Or use the default filename
5. Wait for processing
- Agent processes the document (1-2 minutes for typical files)
- Status shows “Uploading” then “Ready”
6. Done
- Document is now in your Knowledge Base
- Agent can immediately use it to answer questions
Tips for Success
Clear Names
- Use descriptive names: “FAQ - Shipping 2024”
- Not: “doc_final_v3_REAL.pdf”
Good Structure
- Use headers and sections
- Helps the agent find answers faster
- Example: “### Return Policy”, “### Shipping Info”
Keep Content Current
- Update prices when they change
- Remove outdated policies
- Delete old versions
Start Small
- Upload 1-2 documents first
- Test agent answers in Chatlab
- Add more as needed
File Size Limits
- Single file: Up to 50MB (most accounts are limited to 10MB per file; see File Size Limits and Batch Upload)
- Total storage: Depends on your plan
- Most documents are well under 5MB
What Happens Next
After upload:
- Processing - Agent reads and understands document (usually instant)
- Indexed - Document becomes searchable by agent
- Ready to use - Agent can answer questions using this content
Troubleshooting
Upload failed
- Check that the file isn’t corrupted
- Try a different file format (PDF instead of Word)
- Ensure the file isn’t password protected
Agent can’t find the information
- Check that the document has clear headings
- Information might be embedded in images (upload text instead)
- Use Train Your Agent to test
Next: Add Websites & URLs to crawl your website automatically
Document Type/Category: Classify your document for better organization
- Support/FAQ
- Product Documentation
- Company Policies
- Technical Guides
- Marketing Materials
Chunk Size: Choose how the document is split into searchable pieces
- Default (Recommended): 300-500 tokens per chunk
- Small Chunks: 200-300 tokens (for detailed Q&A style content)
- Large Chunks: 500-800 tokens (for narrative content that needs more context)
Tags: Add labels to your document
- Tags help with organization and filtering
- Example tags: “2024”, “premium-tier”, “technical”, “customer-facing”
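If you keep a record of the settings you plan to use for each document, a small structure like the one below can catch typos before you upload. This is only an illustrative sketch: `UploadSettings`, `CATEGORIES`, and `CHUNK_PRESETS` are hypothetical names, and the values are copied from the options listed above rather than from any product API.

```python
from dataclasses import dataclass, field

# Category names and chunk-size ranges copied from the options above.
CATEGORIES = {
    "Support/FAQ",
    "Product Documentation",
    "Company Policies",
    "Technical Guides",
    "Marketing Materials",
}
CHUNK_PRESETS = {          # preset name -> (min tokens, max tokens)
    "default": (300, 500),
    "small": (200, 300),
    "large": (500, 800),
}


@dataclass
class UploadSettings:
    """Hypothetical local record of the settings chosen for one document."""
    category: str
    chunk_preset: str = "default"
    tags: list = field(default_factory=list)

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"Unknown category: {self.category!r}")
        if self.chunk_preset not in CHUNK_PRESETS:
            raise ValueError(f"Unknown chunk preset: {self.chunk_preset!r}")

    @property
    def chunk_token_range(self):
        """Return the (min, max) token range for the chosen preset."""
        return CHUNK_PRESETS[self.chunk_preset]


if __name__ == "__main__":
    faq = UploadSettings("Support/FAQ", "small", ["2024", "customer-facing"])
    print(faq.chunk_token_range)
```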
Step 5: Upload and Process
- Review your settings
- Click “Upload” or “Start Processing”
- Watch the progress bar as your file uploads
- Processing begins automatically after upload completes
Step 6: Verify Upload Success
After processing completes (usually 30 seconds to 5 minutes depending on file size):
- Check the status indicator: Look for a green checkmark or “Active” status
- Preview the content: Click on the document to see how it was processed
- View chunks: See how the document was split into searchable pieces
- Test in Chatlab: Ask questions related to the document to verify it’s working
File Size Limits and Batch Upload
Single File Upload
- Maximum file size: 10MB per file (for most accounts)
- Larger files: Contact support for enterprise limits up to 50MB
- Processing time: Approximately 1 minute per MB
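The limits above translate into simple arithmetic. As a rough sketch, the helpers below check a file against the standard 10MB limit and estimate processing time at about 1 minute per MB. The function names are hypothetical, and treating 1MB as 1024 × 1024 bytes is an assumption, since the exact definition is not stated here.

```python
MAX_FILE_MB = 10        # standard per-file limit quoted above
MINUTES_PER_MB = 1.0    # "approximately 1 minute per MB"
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def within_size_limit(size_bytes: int, limit_mb: float = MAX_FILE_MB) -> bool:
    """Return True if the file is at or under the per-file limit."""
    return size_bytes <= limit_mb * BYTES_PER_MB


def estimate_processing_minutes(size_bytes: int) -> float:
    """Rough processing-time estimate in minutes, rounded to one decimal."""
    return round(size_bytes / BYTES_PER_MB * MINUTES_PER_MB, 1)


if __name__ == "__main__":
    size = 3 * BYTES_PER_MB
    print(within_size_limit(size), estimate_processing_minutes(size))
```

Pass `limit_mb=50` if your account has the enterprise limit.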
Batch Upload
Upload multiple documents simultaneously to save time:
- Select “Batch Upload” option
- Choose multiple files (up to 20 at once)
- Configure shared settings (applied to all files)
- Upload and process all files together
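If you have more than 20 files, you will need several batches. A minimal sketch of splitting a file list into groups that respect the 20-file limit follows; `make_batches` is a hypothetical helper, not a product feature.

```python
MAX_BATCH_SIZE = 20  # "up to 20 at once"


def make_batches(files, batch_size: int = MAX_BATCH_SIZE):
    """Split a list of files into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]


if __name__ == "__main__":
    files = [f"doc_{n}.pdf" for n in range(45)]
    for number, batch in enumerate(make_batches(files), start=1):
        print(f"Batch {number}: {len(batch)} files")
```

Because shared settings apply to every file in a batch, it can help to sort files by category before splitting them.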
Storage Limits
- Starter Plan: 100MB total storage
- Professional Plan: 500MB total storage
- Enterprise Plan: Custom storage limits
- Current usage: Check your dashboard to see available space
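To see whether a planned upload fits in your remaining storage, you can compare the total size of the new files against the plan limits above. This is a sketch under stated assumptions: the helper names are hypothetical, current usage has to be read from your dashboard by hand, and Enterprise limits are custom, so the functions return `None` for that plan.

```python
# Plan limits copied from the list above; Enterprise is custom and omitted.
PLAN_STORAGE_MB = {"starter": 100, "professional": 500}


def remaining_storage_mb(plan: str, used_mb: float):
    """Return remaining storage in MB, or None if the plan limit is unknown."""
    limit = PLAN_STORAGE_MB.get(plan.lower())
    if limit is None:
        return None
    return max(limit - used_mb, 0)


def upload_fits(plan: str, used_mb: float, new_files_mb):
    """Return True/False if the new files fit, or None for unknown limits."""
    remaining = remaining_storage_mb(plan, used_mb)
    if remaining is None:
        return None
    return sum(new_files_mb) <= remaining


if __name__ == "__main__":
    print(remaining_storage_mb("Starter", 92.5))
    print(upload_fits("Starter", 92.5, [4, 4]))
```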
Document Processing Explained
Understanding what happens during processing helps you troubleshoot and optimize:
The Processing Pipeline
1. Text Extraction: ChatCrafterAI extracts readable text from your document
   - For PDFs: OCR is applied if needed for scanned documents
   - For Word docs: Formatting is preserved where relevant
   - For CSV: Data is structured appropriately
2. Structure Analysis: The system identifies headers, sections, and organization
   - Headings become important markers for chunking
   - Tables and lists are preserved
   - Page breaks and logical sections are noted
3. Chunking: The document is split into searchable pieces (see Chunking guide)
   - Respects document structure
   - Maintains context across chunks
   - Creates overlaps to prevent information loss
4. Embedding Generation: Each chunk is converted to a mathematical vector representation (see Embeddings guide)
   - Enables semantic search
   - Captures meaning beyond keywords
   - Powers the intelligent search capabilities
5. Indexing: Processed chunks are added to the searchable database
   - Immediately available for agent queries
   - Integrated with existing Knowledge Base content
   - Ready for customer questions
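To make the chunking and search steps more concrete, here is a toy sketch of the general idea. It is not ChatCrafterAI's actual implementation. Two simplifications are assumed: tokens are approximated by whitespace-separated words, and simple word-count vectors stand in for real embeddings, so this version matches keywords only and does not capture meaning the way a real embedding model does. All function names are hypothetical.

```python
import math
from collections import Counter


def chunk_words(text: str, chunk_size: int = 400, overlap: int = 50):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def word_vector(text: str) -> Counter:
    """Stand-in for an embedding: a lowercase word-count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, chunks, top_k: int = 3):
    """Return the top_k chunks most similar to the query."""
    query_vector = word_vector(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, word_vector(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    policy = "Returns are accepted within 30 days. Shipping takes 5 days."
    pieces = chunk_words(policy, chunk_size=6, overlap=2)
    print(search("how long does shipping take", pieces, top_k=1))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which is the "prevent information loss" point above.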
Processing Time Expectations
- 1-page document: 10-30 seconds
- 10-page document: 1-2 minutes
- 50-page document: 3-5 minutes
- Large manual (200+ pages): 10-20 minutes
Best Practices for Document Upload
1. Organize by Topic
Instead of uploading everything into one massive Knowledge Base, create separate, focused collections:
Good approach:
- Product Documentation KB (for technical users)
- Customer Support KB (for general questions)
- Internal Policies KB (for employee-facing agent)
2. Use Clear, Descriptive Headings
The AI uses your document structure to understand content, so give each section a clear, descriptive heading.
3. Include Metadata and Context
Add dates, versions, and categories to your documents:
Example:
- Document name: “Shipping Policy - Updated January 2024”
- Tags: “2024”, “policies”, “shipping”, “customer-facing”
- Description: “Current shipping times, costs, and international options”
4. Remove Duplicates Before Uploading
Check if content already exists in your Knowledge Base:
- Duplicate content can confuse the search system
- The agent might return multiple similar chunks
- Wastes storage space
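One way to spot duplicates before uploading is to compare a fingerprint of each document's text. The sketch below is a local helper under stated assumptions: the function names are hypothetical, the text has already been extracted into strings, and only exact duplicates (after normalizing case and whitespace) are detected, not near-duplicates or reworded copies.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash the text after normalizing case and whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(documents: dict):
    """Return (duplicate_name, original_name) pairs for identical content.

    `documents` maps a document name to its extracted text.
    """
    seen = {}
    duplicates = []
    for name, text in documents.items():
        key = fingerprint(text)
        if key in seen:
            duplicates.append((name, seen[key]))
        else:
            seen[key] = name
    return duplicates


if __name__ == "__main__":
    docs = {
        "shipping_v1.txt": "Free shipping  over $50.",
        "shipping_copy.txt": "free shipping over $50.",
        "returns.txt": "Returns within 30 days.",
    }
    print(find_duplicates(docs))
```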
5. Clean Up Document Formatting
Before uploading, ensure documents are clean:
- Remove unnecessary headers/footers from every page
- Delete navigation elements (if from a web export)
- Remove “Page X of Y” markers
- Clean up any OCR errors in scanned documents
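For plain-text exports, part of this cleanup can be scripted. The following is a minimal sketch, assuming the document is already available as text: it drops “Page X of Y” lines and any header or footer lines you name explicitly. `clean_text` and its `repeated_lines` parameter are hypothetical, and OCR errors still need manual review.

```python
import re

# Matches lines that consist only of a marker such as "Page 3 of 12".
PAGE_MARKER = re.compile(r"^\s*Page\s+\d+\s+of\s+\d+\s*$", re.IGNORECASE)


def clean_text(text: str, repeated_lines=()) -> str:
    """Remove page markers and known repeated header/footer lines."""
    drop = {line.strip() for line in repeated_lines}
    kept = []
    for line in text.splitlines():
        if PAGE_MARKER.match(line):
            continue  # "Page X of Y" marker
        if line.strip() in drop:
            continue  # header/footer repeated on every page
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    raw = (
        "Acme Corp Confidential\n"
        "Return Policy\n"
        "Page 1 of 3\n"
        "Items can be returned within 30 days."
    )
    print(clean_text(raw, repeated_lines=["Acme Corp Confidential"]))
```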
Document Organization Examples
Example 1: Support FAQ Document Structure
A FAQ document grouped by category, with one question and answer per entry, helps the agent:
- Find specific Q&A pairs quickly
- Understand the category of each question
- Provide complete, focused answers
Example 2: Product Manual Structure
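A product manual organized from basics to advanced topics, with dedicated troubleshooting and feature sections, supports:
- Progressive disclosure (basics to advanced)
- Easy troubleshooting lookups
- Feature-specific queries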
Troubleshooting Common Upload Issues
Issue: File Too Large
Symptoms: Upload fails with “File size exceeds limit” error
Solutions:
- Compress PDF files (many tools available online)
- Split large documents into smaller sections
- Remove high-resolution images if not needed
- Upgrade to a plan with larger file limits
Issue: Format Not Supported
Symptoms: “Unsupported file format” error
Solutions:
- Convert to PDF (universally supported)
- Export from your original application to a supported format
- Copy content to a Word or text document
- Check that file extension matches actual file type
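A renamed file is a common cause of a mismatch between extension and content. As a rough local check, you can compare the first bytes of a file with the signature its extension implies: PDF files begin with `%PDF-`, and .docx files are ZIP archives that begin with `PK\x03\x04`. The helper names below are hypothetical, and plain-text formats (.txt, .md, .csv) have no signature, so the sketch simply skips them.

```python
from pathlib import Path

# Leading bytes expected for formats that have a fixed signature.
SIGNATURES = {
    ".pdf": b"%PDF-",
    ".docx": b"PK\x03\x04",  # .docx is a ZIP container
}


def header_matches_extension(filename: str, header: bytes) -> bool:
    """Return False only when the header contradicts the extension.

    Extensions without a known signature (for example .txt) return True
    because there is nothing to compare against.
    """
    expected = SIGNATURES.get(Path(filename).suffix.lower())
    if expected is None:
        return True
    return header.startswith(expected)


def check_file(path: str) -> bool:
    """Read the first bytes of a file on disk and run the check."""
    with open(path, "rb") as handle:
        return header_matches_extension(path, handle.read(8))


if __name__ == "__main__":
    print(header_matches_extension("manual.pdf", b"%PDF-1.7"))
    print(header_matches_extension("manual.pdf", b"PK\x03\x04\x14\x00"))
```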
Issue: Processing Failed
Symptoms: Upload completes but processing shows error
Solutions:
- Check if PDF is password-protected (remove password first)
- Verify document isn’t corrupted (try opening it on your computer)
- Look for special characters in filename (use simple names)
- Try re-uploading the file
- Contact support if issue persists with specific file
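If special characters in a filename are the suspected cause, renaming the file to a simple ASCII name before re-uploading is an easy test. The sketch below shows one way to do that. Which characters actually cause problems is not specified here, so the allowed set (letters, digits, dot, underscore, hyphen) is an assumption, and `simple_filename` is a hypothetical helper.

```python
import re
import unicodedata
from pathlib import Path


def simple_filename(filename: str) -> str:
    """Return an ASCII-only version of the filename with a lowercase extension."""
    path = Path(filename)
    # Decompose accented characters, then drop anything that is not ASCII.
    stem = unicodedata.normalize("NFKD", path.stem)
    stem = stem.encode("ascii", "ignore").decode("ascii")
    # Replace each run of disallowed characters with a single underscore.
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", stem).strip("._-")
    return (stem or "document") + path.suffix.lower()


if __name__ == "__main__":
    print(simple_filename("R\u00e9sum\u00e9 & Policy (final)!.PDF"))
```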
Issue: Extracted Text Looks Wrong
Symptoms: Preview shows garbled or missing text
Solutions:
- For scanned PDFs: Use better OCR software before uploading
- Export document to text format first, then upload text file
- Manually copy-paste content into a new document
- Check if document uses special fonts or encoding
After Upload: Next Steps
Once your documents are successfully uploaded:
- Test in Chatlab: Ask questions that should be answered by your uploaded content
- Review chunks: Check how the document was split (in document details)
- Refine if needed: Re-upload with better structure if results aren’t optimal
- Add related documents: Upload complementary content for comprehensive coverage
- Set up updates: Create a schedule for updating time-sensitive documents
Related Resources
- Document Chunking for Better Search - Optimize how your documents are split
- Understanding Embeddings and Vector Search - Learn how search works
- Knowledge Base Troubleshooting - Solve common issues