Skip to content Skip to sidebar Skip to footer

Claude 3.5 Sonnet is now capable of analysing PDF files with images and tables

Anthopicโ€™s Claude 3.5 Sonnet can now analyse PDF with greater emphasis on text and detailed visuals like charts and tables. The AI model does this in a three-step manner โ€“ text extraction, dual-layer analysis, and page-to-image extraction. In simple words, the model extracts text from the PDF document and converts each page into images to analyse them. This allows users to gather insights about the visual elements in the document.

Till a few days ago, when a user uploaded a PDF file to Claude.ai, the model used text extraction and later used that as a prompt in the AI model. Now, the model can see PDFs visually alongside the text allowing it to accurately comprehend complex documents, especially those with numerous charts and graphics.

The new feature is now available via Claude Chat feature preview and through API access. Reportedly, the company also plans to include support for Google Vertex AI and Amazon Bedrock in the future. Claude 3.5 Sonnet AI is currently in Public beta. The model is now capable of analysing legal documents, financial reports, and translation by processing text and images, tables, and charts. The PDF feature can be used along with other features.

The new PDF tool processes files under 32 MB and 100 pages with standard token usage for each page ranging between 1,500 to 3,000. Reportedly, the model does not support password-protected or encrypted files.

For best outcomes, the company advises using documents that have readable text and pages that are properly aligned. To use specific sections of a document, users are encouraged to list page numbers. In the case of large documents, Anthropic recommends dividing them into smaller sections. One can try prompt caching while analysing the same document multiple times to enhance efficiency.

Anthropic released the latest version, Claude 3.5 Sonnet, in June this year. The company released its upgraded version last month. Based on the benchmarks in the public domain, the model reportedly outperformed Metaโ€™s Llama 400b, OpenAIโ€™s GPT-4o, and Googleโ€™s Gemini 1.5 Pro.  The latest version of the model is said to have enhanced natural language understanding capabilities. The Claude 3.5 Sonnet is currently available on Claude.ai and Claude iOS app.

Leave a comment