multimodal-agent

Here are 8 public repositories matching this topic...

Build an end-to-end system that ingests inventory report PDFs and images, runs OCR to extract and normalize tabular data, stores the cleaned dataset, and exposes a secure conversational agent that answers business queries over the data (aggregation, filtering, joins, trends) and returns tables, charts, and exportable results.

  • Updated Dec 5, 2025
  • Python
