Case — Rocester (Intelligent Digital Catalog of Industrial Parts)

Type: AutoU (client: Rocester, industrial parts)
Role: Full Stack / AI Developer, involved in founding the project (architecture and initial functional base)
Status: Live (in production); further evolution on the roadmap
Stack: Python, FastAPI, React + TypeScript, PostgreSQL (pgvector enabled), Gemini Vision (PDF extraction), JWT with roles, Docker Compose, ReportLab (roadmap), full-text search (RAG)

Confidentiality note: AutoU client project — validate what can be made public before exposing name/details.

Context and problem

Rocester's salespeople looked up parts in scanned supplier PDF catalogs, so finding a part, checking its OEM code and putting together a quote was slow and error-prone. Digitizing the catalogs by hand was not feasible given the volume.

Solution

A platform that turns catalog PDFs into an AI-searchable database:

  • PDF ingestion pipeline with Gemini Vision: catalog upload → structured part extraction (code, OEM, description, category, equipment) with temperature=0.1, JSON mode and retries
  • Confidence score per part (0.0–1.0) with textual reasoning: extracted parts enter as pending, and a review panel with bulk actions (approve/reject) supports efficient human curation
  • Search and filters by code, OEM, description, category and equipment
  • Quotes: cart, automatic code (ORC-AAMM-XXXX), history
  • AI assistant (chat): free-text part suggestions with RAG over the real catalog
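
The quote code format listed above (ORC-AAMM-XXXX) could be generated along these lines. This is a minimal sketch, not the project's actual code: it assumes AAMM means two-digit year plus two-digit month and XXXX is a zero-padded sequence number, and the function name make_quote_code is hypothetical.

```python
from datetime import date


def make_quote_code(issued_on: date, sequence: int) -> str:
    """Build a quote code such as ORC-2605-0042.

    Assumption: AAMM = two-digit year + two-digit month,
    XXXX = zero-padded sequence number.
    """
    if not 1 <= sequence <= 9999:
        raise ValueError("sequence must fit in four digits")
    return f"ORC-{issued_on:%y%m}-{sequence:04d}"


print(make_quote_code(date(2026, 5, 14), 42))  # ORC-2605-0042
```

In a real deployment the sequence number would have to come from the database so that two sellers quoting at the same time never receive the same code.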

Architecture and technical decisions

  • Dependency inversion in the AI layer: the domain doesn't know the LLM provider — swapping Gemini for another model doesn't touch the business logic (documented in the README architecture)
  • Explicit part lifecycle: extracted → pending review → approved/rejected — AI proposes, human disposes; the public catalog only contains curated data
  • Extraction with anti-hallucination controls: low temperature, strict JSON output, retries and a confidence score with a justification per item
  • JWT with roles (admin / seller): sellers search and quote; admins import and curate
  • pgvector is already enabled in the database to support a later move to semantic search via embeddings (on the documented roadmap, along with asynchronous ingestion processing and PDF quote export)
  • Technical-commercial proposal and project plan versioned alongside the code
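
The dependency inversion and the anti-hallucination controls described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: PartExtractor, parse_parts, extract_with_retries and FlakyFakeExtractor are hypothetical names, the JSON shape is assumed from the field list in the Solution section, and the sample part is made up. A real adapter would call Gemini Vision behind the same interface.

```python
import json
from dataclasses import dataclass
from typing import Protocol

TEXT_FIELDS = ("code", "oem", "description", "category", "equipment", "reasoning")


@dataclass(frozen=True)
class ExtractedPart:
    code: str
    oem: str
    description: str
    category: str
    equipment: str
    reasoning: str
    confidence: float
    status: str = "pending"  # extracted parts always start in the review queue


class PartExtractor(Protocol):
    """Port the domain depends on; a Gemini Vision adapter would implement it."""

    def extract_raw(self, page_image: bytes) -> str: ...


def parse_parts(raw_json: str) -> list[ExtractedPart]:
    """Strictly validate the model's JSON; raise ValueError on any deviation."""
    items = json.loads(raw_json)  # json.JSONDecodeError subclasses ValueError
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of parts")
    parts = []
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("each part must be a JSON object")
        missing = [name for name in (*TEXT_FIELDS, "confidence") if name not in item]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        confidence = item["confidence"]
        if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
            raise ValueError("confidence must be a number")
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence out of range: {confidence}")
        text = {name: str(item[name]) for name in TEXT_FIELDS}
        parts.append(ExtractedPart(confidence=float(confidence), **text))
    return parts


def extract_with_retries(
    extractor: PartExtractor, page_image: bytes, max_attempts: int = 3
) -> list[ExtractedPart]:
    """Call the provider again whenever its output fails validation."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_parts(extractor.extract_raw(page_image))
        except ValueError as error:
            last_error = error
    raise RuntimeError(f"extraction failed after {max_attempts} attempts") from last_error


class FlakyFakeExtractor:
    """Test double: returns garbage once, then one made-up part."""

    def __init__(self) -> None:
        self.calls = 0

    def extract_raw(self, page_image: bytes) -> str:
        self.calls += 1
        if self.calls == 1:
            return "not json"
        return json.dumps([{
            "code": "P-001", "oem": "OEM-123", "description": "Hydraulic seal",
            "category": "Seals", "equipment": "Excavator",
            "confidence": 0.92, "reasoning": "Code and OEM share one table row.",
        }])


fake = FlakyFakeExtractor()
parts = extract_with_retries(fake, b"fake-page-bytes")
print(fake.calls, parts[0].status, parts[0].confidence)  # 2 pending 0.92
```

Because the domain functions only see the PartExtractor interface, the fake above and a real provider adapter are interchangeable, which is the property the first bullet describes.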

Challenges and solutions

  • Heterogeneous supplier PDFs (tables, images, varying layouts): multimodal extraction with Gemini Vision plus bulk human review balances automation and accuracy
  • Seller trust in the data: score + reasoning per part make "how sure is the AI" visible instead of hidden
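
The bulk review step, together with the part lifecycle and the admin/seller roles from the architecture section, might look roughly like the sketch below. All names (PartStatus, Part, bulk_review, public_catalog) and the sample parts are hypothetical; the sketch only assumes what the text states: parts start as pending, admins approve or reject them in bulk, and only approved parts reach the catalog.

```python
from dataclasses import dataclass
from enum import Enum


class PartStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Part:
    code: str
    confidence: float  # shown to the reviewer next to the reasoning
    reasoning: str
    status: PartStatus = PartStatus.PENDING


def bulk_review(parts: list[Part], codes: list[str], decision: PartStatus, role: str) -> int:
    """Apply one decision to many pending parts; return how many changed."""
    if role != "admin":
        raise PermissionError("only admins can curate parts")
    if decision is PartStatus.PENDING:
        raise ValueError("decision must be APPROVED or REJECTED")
    selected = set(codes)
    changed = 0
    for part in parts:
        # Only pending parts may move; reviewed parts keep their status.
        if part.code in selected and part.status is PartStatus.PENDING:
            part.status = decision
            changed += 1
    return changed


def public_catalog(parts: list[Part]) -> list[Part]:
    """Sellers only ever see curated (approved) parts."""
    return [part for part in parts if part.status is PartStatus.APPROVED]


# Made-up sample data for illustration.
parts = [
    Part("P-001", 0.95, "Code and OEM found in the same table row."),
    Part("P-002", 0.40, "OEM code partially unreadable in the scan."),
    Part("P-003", 0.88, "Description matched the page heading."),
]
approved = bulk_review(parts, ["P-001", "P-003"], PartStatus.APPROVED, role="admin")
rejected = bulk_review(parts, ["P-002"], PartStatus.REJECTED, role="admin")
print(approved, rejected, [part.code for part in public_catalog(parts)])
# 2 1 ['P-001', 'P-003']
```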

Results and impact

  • PDF catalogs become a searchable base with no manual data entry [number of parts/catalogs TO CONFIRM]
  • Quotes assembled in the same flow as the search — less tool-switching for the seller
  • Bulk curation reduces the human cost of validating AI extraction
Wesley Correia

Full Stack Developer passionate about solving people's problems, crafting innovative solutions, and building amazing digital experiences.

© 2026 Wesley de Carvalho Augusto Correia. All rights reserved.