POST /api/extractor/v1/extract-async
curl -X POST https://api.appliedai.club/api/extractor/v1/extract-async \
  -H "Authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": {
      "url": "https://example.com/invoice.pdf"
    },
    "extraction_schema": {
      "fields": [
        {"name": "invoice_number", "type": "string", "description": "Invoice number from the document"},
        {"name": "total_amount", "type": "number", "description": "Total amount from the invoice"}
      ],
      "document_description": "Invoice document"
    },
    "callback_url": "https://your-server.com/webhook"
  }'
{
  "task_id": "<string>",
  "status": "processing",
  "message": "<string>"
}
This API requires authentication. See our Authentication Guide for details.
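
For readers calling the endpoint from Python rather than curl, the request above can be sketched with the standard library alone. This is a minimal sketch, not an official client: the helper name `build_extract_request` is hypothetical, and the raw API key is placed in the `Authorization` header only because the curl example does so. The function builds the request without sending it.

```python
import json
import urllib.request

ENDPOINT = "https://api.appliedai.club/api/extractor/v1/extract-async"


def build_extract_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example.

    Hypothetical helper: the header layout mirrors the curl example above.
    """
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "document": {"url": "https://example.com/invoice.pdf"},
    "extraction_schema": {
        "fields": [
            {"name": "invoice_number", "type": "string",
             "description": "Invoice number from the document"},
            {"name": "total_amount", "type": "number",
             "description": "Total amount from the invoice"},
        ],
        "document_description": "Invoice document",
    },
    "callback_url": "https://your-server.com/webhook",
}
request = build_extract_request("YOUR_API_KEY", payload)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as resp:
#     body = json.load(resp)
```

Keeping construction separate from sending makes the request easy to inspect or log before any network call is made.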

Features

  • Supports PDF and image files (jpg, jpeg, png)
  • Accepts URLs or base64-encoded file content
  • Maximum file size: 10MB
  • Supports both direct schema and template-based extraction
  • Returns immediately with a task_id for status tracking
  • Optional callback URL for webhook notification when processing completes
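
The file-type and size limits listed above can be checked on the client before uploading. The sketch below rests on assumptions the page does not confirm: it reads "10MB" as 10 × 1024 × 1024 bytes measured on the raw file (not on the base64 text), and it judges file type by extension only. `encode_document` is a hypothetical helper name.

```python
import base64
import os

ALLOWED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png"}
# Assumption: "10MB" is read as 10 MiB of raw (pre-encoding) bytes.
MAX_FILE_BYTES = 10 * 1024 * 1024


def encode_document(filename: str, data: bytes) -> dict:
    """Validate a local file against the listed limits and base64-encode it."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    if len(data) > MAX_FILE_BYTES:
        raise ValueError(f"file is {len(data)} bytes; limit is {MAX_FILE_BYTES}")
    # The result is suitable as the "document" object of the request body.
    return {"file_content": base64.b64encode(data).decode("ascii")}
```

Base64 encoding grows the data by about one third, so a file close to the limit produces a noticeably larger request body.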

Request Body

  • document (object): Document to process, given as either a URL or file content
  • document.url (string): URL of the document to process
  • document.file_content (string): Base64-encoded file content
  • extraction_schema (object): Extraction schema definition
  • template_id (string): ID of a saved extraction template
  • callback_url (string): Optional URL for webhook notification when processing completes
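
The parameters above combine into a request body in a few ways: URL or file content for the document, and a direct schema or a saved template for extraction. The sketch below assembles that body under one assumption the page does not state, namely that exactly one option from each pair is supplied. `build_payload` and the template ID `tmpl_123` are hypothetical.

```python
from typing import Optional


def build_payload(
    *,
    url: Optional[str] = None,
    file_content: Optional[str] = None,
    extraction_schema: Optional[dict] = None,
    template_id: Optional[str] = None,
    callback_url: Optional[str] = None,
) -> dict:
    """Assemble the nested request body described in the parameter list."""
    # Assumption: the document is given by URL or by content, never both.
    if (url is None) == (file_content is None):
        raise ValueError("provide exactly one of url or file_content")
    # Assumption: direct schema and template are mutually exclusive.
    if (extraction_schema is None) == (template_id is None):
        raise ValueError("provide exactly one of extraction_schema or template_id")

    document = {"url": url} if url is not None else {"file_content": file_content}
    payload: dict = {"document": document}
    if extraction_schema is not None:
        payload["extraction_schema"] = extraction_schema
    else:
        payload["template_id"] = template_id
    if callback_url is not None:
        payload["callback_url"] = callback_url  # optional, so omitted when unset
    return payload


# Template-based extraction without a webhook (template ID is made up):
template_payload = build_payload(
    url="https://example.com/invoice.pdf", template_id="tmpl_123"
)
```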

Body

application/json

Request model for asynchronous extraction

  • document_url (string | null)
  • file_content (string | null)
  • extraction_schema (object | null)
  • template_id (string | null)
  • provider (enum<string> | null): Available options: openai, anthropic, gemini
  • document_description (string | null)
  • callback_url (string | null)
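
The field list above uses flat names (`document_url`, `file_content`), unlike the nested `document` object in the curl example. The sketch below models the flat names exactly as listed and makes no claim about which form the server accepts. The class name `AsyncExtractBody` is hypothetical, and dropping unset fields instead of sending explicit nulls is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Optional

PROVIDERS = {"openai", "anthropic", "gemini"}


@dataclass
class AsyncExtractBody:
    """Flat request model mirroring the listed fields; all are nullable."""
    document_url: Optional[str] = None
    file_content: Optional[str] = None
    extraction_schema: Optional[dict] = None
    template_id: Optional[str] = None
    provider: Optional[str] = None
    document_description: Optional[str] = None
    callback_url: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject provider values outside the documented enum.
        if self.provider is not None and self.provider not in PROVIDERS:
            raise ValueError(f"provider must be one of {sorted(PROVIDERS)}")

    def to_dict(self) -> dict:
        # Assumption: omitting a null field is equivalent to sending null.
        return {k: v for k, v in asdict(self).items() if v is not None}


body = AsyncExtractBody(
    document_url="https://example.com/invoice.pdf",
    template_id="tmpl_123",  # made-up template ID
    provider="gemini",
)
```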

Response

Successful Response

Response model for asynchronous extraction request

  • task_id (string, required)
  • status (enum<string>, required): Available options: processing, retrying, completed, failed
  • message (string, required)
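
A client will usually want to validate this response and decide whether the task is still in flight. The sketch below checks the three required fields and the status enum. Treating `completed` and `failed` as terminal states, and `processing` and `retrying` as in-progress, is an assumption drawn from the names alone. `TaskResponse` and `parse_task_response` are hypothetical names.

```python
import json
from dataclasses import dataclass

STATUSES = {"processing", "retrying", "completed", "failed"}
# Assumption: these two statuses mean no further work will happen.
TERMINAL_STATUSES = {"completed", "failed"}


@dataclass(frozen=True)
class TaskResponse:
    task_id: str
    status: str
    message: str

    @property
    def is_terminal(self) -> bool:
        return self.status in TERMINAL_STATUSES


def parse_task_response(body: str) -> TaskResponse:
    """Parse the JSON response body and validate the documented fields."""
    data = json.loads(body)
    missing = [k for k in ("task_id", "status", "message") if k not in data]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    if data["status"] not in STATUSES:
        raise ValueError(f"unexpected status: {data['status']!r}")
    return TaskResponse(data["task_id"], data["status"], data["message"])
```

Since the endpoint returns immediately, the first response would normally carry a non-terminal status; the final state arrives later through status tracking or the callback URL.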