Quick Overview This workflow ingests educational PDF URLs from Google Sheets, extracts and chunks their text, generates embeddings with Google Gemini, and stores them in a Supabase pgvector table for retrieval, while also exposing a public chat webhook that answers questions using Gemini and the same Supabase knowledge base. How it works 1. Runs every hour on a schedule and reads rows from a Google Sheets document, keeping only entries where the status is empty. 2. Processes each queued URL one

Quick Overview This workflow ingests educational PDF URLs from Google Sheets, extracts and chunks their text, generates embeddings with Google Gemini, and stores them in a Supabase pgvector table for retrieval, while also exposing a public chat webhook that answers questions using Gemini and the same Supabase knowledge base. How it works 1. Runs every hour on a schedule and reads rows from a Google Sheets document, keeping only entries where the status is empty. 2. Processes each queued URL one at a time and routes it based on whether it is a seraj-uae.com page or a direct Google Drive file link. 3. For seraj-uae.com URLs, fetches the HTML page to extract the PDF download link (or embedded Google Drive file ID) and the document title. 4. Downloads the PDF from either the source website via HTTP or from Google Drive, waits briefly, and extracts text from the PDF binary. 5. Cleans and validates the extracted text, then splits it into overlapping chunks and generates embeddings using Google Gemini. 6. Inserts the embedded chunks into a Supabase pgvector table with file/title/source metadata and updates the Google Sheets row to “Embedded” or “Not Text Based PDF” if no text is found. 7. Separately, exposes a public n8n Chat webhook that receives student questions and uses Gemini Flash with Supabase vector retrieval to return answers. Setup 1. Create a Supabase project with pgvector enabled, create the serajdocuments table, and add the matchserajdocuments function used for vector search. 2. Add Supabase API credentials in n8n and ensure the table name (serajdocuments) and query function name (matchserajdocuments) match your Supabase setup. 3. Add Google Sheets OAuth2 credentials and update the Google Sheet document ID/sheet reference, ensuring columns include sourceurl, status, and rownumber. 4. Add Google Drive OAuth2 credentials for downloading PDFs hosted in Drive. 5. Add a Google Gemini (PaLM) API credential for embeddings and Gemini Flash, then activate the workflow and copy the public chat webhook URL for your website chat widget.
Download the workflow JSON file after purchase.
Open n8n → click the menu → Import from File.
Select the downloaded JSON and import.
Set up credentials for each node that requires them.
Click Execute Workflow to test, then activate.
Setup guide included
Purchase to unlock the full step-by-step guide
No reviews yet
Be the first to buy and share your experience.
Leave a review
Sign in to share your experience with this workflow.
Create a free account to purchase workflows.
Need help setting this up?
Book a 3-hour live setup session with an Agility consultant.