PageIndex API Documentation
Welcome to the PageIndex API documentation. PageIndex converts PDF documents into structured, hierarchical tree representations optimized for agentic RAG. Learn more here.
See an example of the output structure generated by PageIndex from the example PDF document (2023 Annual Report of the Board of Governors of the Federal Reserve System).
If you have any questions, please join our Discord community.
API Key
Get your API key here.
Quickstart
The PageIndex API consists of two main endpoints:
- Submit Endpoint (/pageindex): Submit a PDF document for processing.
- Status Endpoint (/pageindex/status): Check processing status and retrieve results when completed.
Python Usage Example
import requests

# Submit PDF for processing
with open('./2023-annual-report.pdf', 'rb') as f:
    submit_response = requests.post(
        "https://api.vectify.ai/pageindex",
        headers={'api_key': 'YOUR_API_KEY_HERE'},
        files={'file': f}
    )
task_id = submit_response.json()["task_id"]

# Check processing status
status_response = requests.post(
    "https://api.vectify.ai/pageindex/status",
    headers={'api_key': 'YOUR_API_KEY_HERE'},
    json={"task_id": task_id}
)
status_data = status_response.json()

# Retrieve results when processing is complete
if status_data["status"] == "completed":
    print("Tree Structure Result:", status_data["result"])
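The example above checks the status only once, so a document that is still being processed would print nothing. A minimal polling sketch follows. It reuses the endpoint, header, and JSON fields from the example, while the helper names (`fetch_status`, `wait_for_result`), the 5-second interval, and the attempt limit are illustrative assumptions rather than part of the PageIndex API. Because the status values other than "completed" are not documented here, the sketch treats any other status as "still in progress".

```python
import time
from typing import Any, Callable, Dict

API_BASE = "https://api.vectify.ai"  # base URL taken from the usage example


def fetch_status(task_id: str, api_key: str) -> Dict[str, Any]:
    """Call the status endpoint once and return the decoded JSON body."""
    # Imported lazily so the polling helper below can be exercised
    # with a stand-in fetch function that needs no HTTP dependency.
    import requests

    response = requests.post(
        f"{API_BASE}/pageindex/status",
        headers={"api_key": api_key},
        json={"task_id": task_id},
        timeout=30,
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing them
    return response.json()


def wait_for_result(
    task_id: str,
    api_key: str,
    fetch: Callable[[str, str], Dict[str, Any]] = fetch_status,
    interval: float = 5.0,   # assumed polling interval, in seconds
    max_attempts: int = 60,  # assumed upper bound on status checks
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll until the task reports "completed", then return its result."""
    for _ in range(max_attempts):
        data = fetch(task_id, api_key)
        if data.get("status") == "completed":
            return data["result"]
        # Assumption: every other status means processing is still under way.
        sleep(interval)
    raise TimeoutError(
        f"Task {task_id} did not complete after {max_attempts} status checks"
    )
```

With the `task_id` from the submit step, `wait_for_result(task_id, 'YOUR_API_KEY_HERE')` would block until the tree structure is available or the attempt limit is reached. Passing `fetch` and `sleep` as parameters keeps the loop easy to test without network access.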
Notes
- Currently supports PDF files only.
- Future updates will include additional document formats, improved parsing, and enhanced database integration.
For support or feedback, please join our Discord community.