{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# 🍳 Multimodal Recipe Agent with LanceDB and PydanticAI\n",
        "\n",
        "In this tutorial, you'll build an intelligent AI agent that helps users discover recipes. The agent uses LanceDB to store both text and image embeddings and PydanticAI for intelligent reasoning.\n",
        "\n",
        "## What You'll Learn\n",
        "\n",
        "- How to build AI agents with multimodal capabilities\n",
        "- Using LanceDB for efficient vector storage and retrieval\n",
        "- Creating custom tools for PydanticAI agents\n",
        "- Storing CLIP image embeddings to lay the groundwork for image search\n",
        "\n",
        "## Prerequisites\n",
        "\n",
        "This tutorial assumes you have:\n",
        "- Python 3.9+ installed (PydanticAI does not support older versions)\n",
        "- Basic understanding of vector databases\n",
        "- Familiarity with AI/ML concepts (helpful but not required)\n",
        "\n",
        "Let's get started!\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 1. Setup and Installation\n",
        "\n",
        "First, let's install the required dependencies:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Install required packages\n",
        "%pip install lancedb pydantic-ai sentence-transformers transformers torch pillow pandas numpy\n"
      ]
    },
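    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The agent we build later calls the OpenAI API, so an API key needs to be available before it runs. Below is a minimal sketch (the `ensure_openai_key` helper is an illustrative name, not part of any library): it reuses an existing `OPENAI_API_KEY` environment variable and prompts only when the key is missing.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import os\n",
        "from getpass import getpass\n",
        "\n",
        "def ensure_openai_key() -> str:\n",
        "    # Illustrative helper: reuse the env var if set, otherwise prompt securely\n",
        "    key = os.environ.get('OPENAI_API_KEY')\n",
        "    if not key:\n",
        "        key = getpass('Enter your OpenAI API key: ')\n",
        "        os.environ['OPENAI_API_KEY'] = key\n",
        "    return key\n",
        "\n",
        "# Call ensure_openai_key() once before creating the agent\n"
      ]
    },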
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 2. Data Preparation\n",
        "\n",
        "For this tutorial, we'll use a recipe dataset with both text and images. Let's start by setting up our data directory and downloading a sample dataset:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import os\n",
        "import pandas as pd\n",
        "from pathlib import Path\n",
        "import requests\n",
        "import zipfile\n",
        "\n",
        "# Create data directory\n",
        "data_dir = Path(\"data\")\n",
        "data_dir.mkdir(exist_ok=True)\n",
        "\n",
        "print(\"📁 Data directory created\")\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# For this tutorial, we'll create a sample dataset\n",
        "# In practice, you would load your actual recipe data\n",
        "sample_recipes = [\n",
        "    {\n",
        "        \"id\": \"recipe_1\",\n",
        "        \"title\": \"Classic Spaghetti Carbonara\",\n",
        "        \"ingredients\": [\"pasta\", \"eggs\", \"pancetta\", \"parmesan\", \"black pepper\"],\n",
        "        \"instructions\": \"Cook pasta according to package directions. In a bowl, whisk eggs with parmesan. Cook pancetta until crispy. Toss hot pasta with pancetta, then with egg mixture. Serve immediately.\",\n",
        "        \"image_name\": \"carbonara.jpg\"\n",
        "    },\n",
        "    {\n",
        "        \"id\": \"recipe_2\",\n",
        "        \"title\": \"Chocolate Chip Cookies\",\n",
        "        \"ingredients\": [\"flour\", \"butter\", \"sugar\", \"eggs\", \"chocolate chips\", \"vanilla\", \"baking soda\"],\n",
        "        \"instructions\": \"Preheat oven to 375°F. Mix dry ingredients. Cream butter and sugar. Add eggs and vanilla. Combine wet and dry ingredients. Fold in chocolate chips. Bake 9-11 minutes.\",\n",
        "        \"image_name\": \"cookies.jpg\"\n",
        "    },\n",
        "    {\n",
        "        \"id\": \"recipe_3\",\n",
        "        \"title\": \"Grilled Salmon with Herbs\",\n",
        "        \"ingredients\": [\"salmon fillets\", \"olive oil\", \"dill\", \"lemon\", \"garlic\", \"salt\", \"pepper\"],\n",
        "        \"instructions\": \"Preheat grill. Season salmon with salt, pepper, and herbs. Brush with olive oil. Grill 4-5 minutes per side. Serve with lemon wedges.\",\n",
        "        \"image_name\": \"salmon.jpg\"\n",
        "    }\n",
        "]\n",
        "\n",
        "# Create sample CSV\n",
        "df = pd.DataFrame(sample_recipes)\n",
        "df.to_csv(\"data/recipes.csv\", index=False)\n",
        "\n",
        "print(f\"✅ Created sample dataset with {len(sample_recipes)} recipes\")\n",
        "print(\"\\nSample recipes:\")\n",
        "for recipe in sample_recipes:\n",
        "    print(f\"- {recipe['title']}\")\n"
      ]
    },
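    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "One gotcha with `to_csv`: list-valued columns such as `ingredients` are written out as their string representation, so reading the CSV back yields strings rather than lists. A small sketch of the round-trip, using `ast.literal_eval` to restore the lists (we work in-memory via `io.StringIO` for brevity):\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import ast\n",
        "import io\n",
        "import pandas as pd\n",
        "\n",
        "# Round-trip a list-valued column through CSV\n",
        "df = pd.DataFrame({'id': ['recipe_1'], 'ingredients': [['pasta', 'eggs']]})\n",
        "buf = io.StringIO()\n",
        "df.to_csv(buf, index=False)\n",
        "buf.seek(0)\n",
        "\n",
        "loaded = pd.read_csv(buf)\n",
        "# The column comes back as the string \"['pasta', 'eggs']\"...\n",
        "print(type(loaded.loc[0, 'ingredients']))  # <class 'str'>\n",
        "# ...so parse it back into a real Python list\n",
        "loaded['ingredients'] = loaded['ingredients'].apply(ast.literal_eval)\n",
        "print(loaded.loc[0, 'ingredients'])  # ['pasta', 'eggs']\n"
      ]
    },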
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 3. Setting Up LanceDB\n",
        "\n",
        "Now let's set up LanceDB to store our recipe data with both text and image embeddings:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import lancedb\n",
        "import numpy as np\n",
        "import torch\n",
        "from sentence_transformers import SentenceTransformer\n",
        "from transformers import CLIPModel, CLIPProcessor\n",
        "from PIL import Image\n",
        "import io\n",
        "import base64\n",
        "\n",
        "# Configuration\n",
        "LANCEDB_PATH = \"data/recipes.lance\"\n",
        "TEXT_MODEL = \"all-MiniLM-L6-v2\"\n",
        "IMAGE_MODEL = \"openai/clip-vit-base-patch32\"\n",
        "\n",
        "print(\"🔧 Setting up models and database...\")\n",
        "\n",
        "# Set device\n",
        "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
        "print(f\"Using device: {device}\")\n",
        "\n",
        "# Load models\n",
        "text_model = SentenceTransformer(TEXT_MODEL)\n",
        "text_model.to(device)\n",
        "\n",
        "image_model = CLIPModel.from_pretrained(IMAGE_MODEL)\n",
        "image_processor = CLIPProcessor.from_pretrained(IMAGE_MODEL)\n",
        "image_model.to(device)\n",
        "\n",
        "print(\"✅ Models loaded successfully\")\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Connect to LanceDB\n",
        "db = lancedb.connect(LANCEDB_PATH)\n",
        "\n",
        "# Create sample image data (in practice, you'd load actual images)\n",
        "def create_sample_image(width=224, height=224, color=(100, 150, 200)):\n",
        "    \"\"\"Create a sample image for demonstration\"\"\"\n",
        "    image = Image.new('RGB', (width, height), color)\n",
        "    return image\n",
        "\n",
        "# Process recipes and create embeddings\n",
        "recipes_data = []\n",
        "\n",
        "for i, recipe in enumerate(sample_recipes):\n",
        "    # Create text embedding\n",
        "    text_content = f\"{recipe['title']} {' '.join(recipe['ingredients'])} {recipe['instructions']}\"\n",
        "    text_embedding = text_model.encode([text_content], convert_to_tensor=True, device=device)\n",
        "    text_vector = text_embedding.cpu().numpy().flatten()\n",
        "    \n",
        "    # Create sample image and embedding\n",
        "    sample_image = create_sample_image()\n",
        "    \n",
        "    # Convert image to bytes for storage\n",
        "    img_buffer = io.BytesIO()\n",
        "    sample_image.save(img_buffer, format='JPEG')\n",
        "    image_binary = img_buffer.getvalue()\n",
        "    \n",
        "    # Create image embedding\n",
        "    inputs = image_processor(images=sample_image, return_tensors=\"pt\", padding=True)\n",
        "    inputs = {k: v.to(device) for k, v in inputs.items()}\n",
        "    \n",
        "    with torch.no_grad():\n",
        "        image_features = image_model.get_image_features(**inputs)\n",
        "        image_features = image_features / image_features.norm(dim=-1, keepdim=True)\n",
        "    \n",
        "    image_vector = image_features.cpu().numpy().flatten()\n",
        "    \n",
        "    # Prepare recipe data\n",
        "    recipe_data = {\n",
        "        \"id\": recipe[\"id\"],\n",
        "        \"title\": recipe[\"title\"],\n",
        "        \"ingredients\": recipe[\"ingredients\"],\n",
        "        \"instructions\": recipe[\"instructions\"],\n",
        "        \"image_name\": recipe[\"image_name\"],\n",
        "        \"text_embedding\": text_vector,\n",
        "        \"image_embedding\": image_vector,\n",
        "        \"image_binary\": image_binary,\n",
        "        \"num_ingredients\": len(recipe[\"ingredients\"])\n",
        "    }\n",
        "    \n",
        "    recipes_data.append(recipe_data)\n",
        "\n",
        "print(f\"✅ Processed {len(recipes_data)} recipes with embeddings\")\n"
      ]
    },
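    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Note the normalization step above: dividing `image_features` by its norm makes each embedding a unit vector, so a plain dot product between two embeddings equals their cosine similarity — exactly what nearest-neighbor search relies on. A toy illustration with made-up 2-D vectors:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "\n",
        "# Toy stand-ins for two embedding vectors (values are illustrative)\n",
        "a = np.array([3.0, 4.0])\n",
        "b = np.array([4.0, 3.0])\n",
        "\n",
        "# L2-normalize, mirroring image_features / image_features.norm(...)\n",
        "a_n = a / np.linalg.norm(a)\n",
        "b_n = b / np.linalg.norm(b)\n",
        "\n",
        "# For unit vectors, the dot product is exactly the cosine similarity\n",
        "cos_sim = float(a_n @ b_n)\n",
        "print(round(cos_sim, 2))  # 0.96\n"
      ]
    },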
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Create LanceDB table\n",
        "if \"recipes\" in db.table_names():\n",
        "    db.drop_table(\"recipes\")\n",
        "\n",
        "table = db.create_table(\"recipes\", recipes_data)\n",
        "print(\"✅ LanceDB table created successfully\")\n",
        "print(f\"Table schema: {table.schema}\")\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 4. Building the AI Agent\n",
        "\n",
        "Now let's create our PydanticAI agent with custom tools for recipe search:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from pydantic_ai import Agent\n",
        "from typing import List, Dict, Any\n",
        "\n",
        "class RecipeSearchTools:\n",
        "    \"\"\"Tools for the PydanticAI agent\"\"\"\n",
        "    \n",
        "    def __init__(self, db_path: str, text_model, image_model, image_processor, device):\n",
        "        self.db = lancedb.connect(db_path)\n",
        "        self.table = self.db.open_table(\"recipes\")\n",
        "        self.text_model = text_model\n",
        "        self.image_model = image_model\n",
        "        self.image_processor = image_processor\n",
        "        self.device = device\n",
        "    \n",
        "    def _safe_convert(self, value):\n",
        "        \"\"\"Safely convert numpy types to Python types for JSON serialization\"\"\"\n",
        "        import numpy as np\n",
        "        \n",
        "        if isinstance(value, np.ndarray):\n",
        "            if value.size == 1:\n",
        "                return value.item()\n",
        "            else:\n",
        "                return value.tolist()\n",
        "        elif hasattr(value, \"item\") and hasattr(value, \"size\") and value.size == 1:\n",
        "            return value.item()\n",
        "        elif hasattr(value, \"tolist\"):\n",
        "            return value.tolist()\n",
        "        elif isinstance(value, (list, tuple)):\n",
        "            return [self._safe_convert(item) for item in value]\n",
        "        else:\n",
        "            return value\n",
        "    \n",
        "    def search_recipes_by_text(self, query: str, limit: int = 5) -> List[Dict[str, Any]]:\n",
        "        \"\"\"Search recipes by text query\"\"\"\n",
        "        # Generate query embedding\n",
        "        query_embedding = self.text_model.encode(\n",
        "            [query], convert_to_tensor=True, device=self.device\n",
        "        )\n",
        "        query_vector = query_embedding.cpu().numpy().flatten()\n",
        "        \n",
        "        # Search in LanceDB\n",
        "        results = (\n",
        "            self.table.search(query_vector, vector_column_name=\"text_embedding\")\n",
        "            .limit(limit)\n",
        "            .to_pandas()\n",
        "        )\n",
        "        \n",
        "        # Convert to list of dicts\n",
        "        recipes = []\n",
        "        for _, row in results.iterrows():\n",
        "            recipe = {\n",
        "                \"id\": self._safe_convert(row[\"id\"]),\n",
        "                \"title\": self._safe_convert(row[\"title\"]),\n",
        "                \"ingredients\": self._safe_convert(row[\"ingredients\"]),\n",
        "                \"instructions\": self._safe_convert(row[\"instructions\"]),\n",
        "                \"num_ingredients\": self._safe_convert(row[\"num_ingredients\"]),\n",
        "                \"score\": self._safe_convert(row.get(\"_distance\", 0)),\n",
        "            }\n",
        "            recipes.append(recipe)\n",
        "        \n",
        "        return recipes\n",
        "    \n",
        "    def get_available_ingredients(self) -> List[str]:\n",
        "        \"\"\"Get all unique ingredients in the dataset\"\"\"\n",
        "        try:\n",
        "            results = self.table.to_pandas()\n",
        "            all_ingredients = set()\n",
        "            \n",
        "            for _, row in results.iterrows():\n",
        "                if row[\"ingredients\"]:\n",
        "                    all_ingredients.update(row[\"ingredients\"])\n",
        "            \n",
        "            return sorted(list(all_ingredients))\n",
        "        except Exception as e:\n",
        "            print(f\"Error getting ingredients: {e}\")\n",
        "            return []\n",
        "    \n",
        "    def get_recipes_with_images(self, query: str, limit: int = 5) -> str:\n",
        "        \"\"\"Search recipes and return formatted response\"\"\"\n",
        "        try:\n",
        "            recipes = self.search_recipes_by_text(query, limit)\n",
        "            \n",
        "            if not recipes:\n",
        "                return \"No recipes found matching your query.\"\n",
        "            \n",
        "            response_parts = []\n",
        "            response_parts.append(f\"Here are {len(recipes)} recipes that match your query:\\n\")\n",
        "            \n",
        "            for recipe in recipes:\n",
        "                response_parts.append(f\"## {recipe['title']}\")\n",
        "                response_parts.append(f\"**Ingredients:** {recipe['num_ingredients']} ingredients\")\n",
        "                \n",
        "                # Add ingredients list\n",
        "                if recipe.get(\"ingredients\"):\n",
        "                    ingredients_text = \", \".join(recipe[\"ingredients\"][:5])\n",
        "                    if len(recipe[\"ingredients\"]) > 5:\n",
        "                        ingredients_text += \"...\"\n",
        "                    response_parts.append(f\"*{ingredients_text}*\")\n",
        "                \n",
        "                # Add instructions preview\n",
        "                if recipe.get(\"instructions\"):\n",
        "                    instructions = recipe[\"instructions\"]\n",
        "                    if len(instructions) > 200:\n",
        "                        instructions = instructions[:200] + \"...\"\n",
        "                    response_parts.append(f\"**Instructions:** {instructions}\")\n",
        "                \n",
        "                response_parts.append(\"---\\n\")\n",
        "            \n",
        "            return \"\\n\".join(response_parts)\n",
        "        \n",
        "        except Exception as e:\n",
        "            return f\"Error searching recipes: {str(e)}\"\n",
        "\n",
        "print(\"✅ Recipe search tools defined\")\n"
      ]
    },
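    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "A quick aside on why `_safe_convert` is needed: values coming out of the pandas result frame are NumPy types (`np.int64`, `np.ndarray`), which `json.dumps` cannot serialize. A standalone sketch of the same conversion idea:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import json\n",
        "import numpy as np\n",
        "\n",
        "def to_jsonable(value):\n",
        "    # Same idea as _safe_convert: scalars via .item(), arrays via .tolist()\n",
        "    if isinstance(value, np.ndarray):\n",
        "        return value.item() if value.size == 1 else value.tolist()\n",
        "    if isinstance(value, np.generic):\n",
        "        return value.item()\n",
        "    if isinstance(value, (list, tuple)):\n",
        "        return [to_jsonable(v) for v in value]\n",
        "    return value\n",
        "\n",
        "row = {'num_ingredients': np.int64(5), 'score': np.array([0.12])}\n",
        "clean = {k: to_jsonable(v) for k, v in row.items()}\n",
        "print(json.dumps(clean))  # {\"num_ingredients\": 5, \"score\": 0.12}\n"
      ]
    },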
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Initialize tools\n",
        "tools_instance = RecipeSearchTools(\n",
        "    LANCEDB_PATH, text_model, image_model, image_processor, device\n",
        ")\n",
        "\n",
        "# Create PydanticAI agent\n",
        "agent = Agent(\n",
        "    \"openai:gpt-4o-mini\",\n",
        "    tools=[\n",
        "        tools_instance.get_recipes_with_images,\n",
        "        tools_instance.search_recipes_by_text,\n",
        "        tools_instance.get_available_ingredients,\n",
        "    ],\n",
        "    system_prompt=\"\"\"You are a helpful recipe assistant. You can search for recipes by text.\n",
        "\n",
        "CRITICAL RULES - FOLLOW THESE EXACTLY:\n",
        "1. ALWAYS use the provided tools to search for recipes - NEVER generate recipe responses manually\n",
        "2. For ANY text-based recipe search request, use get_recipes_with_images tool\n",
        "3. These tools automatically format recipes with proper markdown\n",
        "4. DO NOT generate your own recipe responses - always use the tools\n",
        "\n",
        "Be helpful and provide detailed recipe information with proper markdown formatting.\"\"\",\n",
        ")\n",
        "\n",
        "print(\"✅ PydanticAI agent created successfully\")\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 5. Testing the Agent\n",
        "\n",
        "Let's test our agent with some sample queries:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Test the agent with different queries\n",
        "query = \"Find me some pasta recipes\"\n",
        "print(f\"\\n🔍 Query: {query}\")\n",
        "print(\"-\" * 50)\n",
        "\n",
        "# In a notebook, an asyncio event loop is already running,\n",
        "# so we can await the agent's async method directly.\n",
        "if 'agent' in locals():  # Check that the agent was created successfully\n",
        "    result = await agent.run(query)\n",
        "    # Print the agent's final text output\n",
        "    print(result.output)\n",
        "else:\n",
        "    print(\"Agent could not be initialized. Make sure your OpenAI API key is set via the OPENAI_API_KEY environment variable.\")\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 6. Summary and Next Steps\n",
        "\n",
        "Congratulations! You've built a complete multimodal recipe agent with the following features:\n",
        "\n",
        "### What You've Accomplished\n",
        "\n",
        "1. **Multimodal Data Storage**: Used LanceDB to store both text and image embeddings\n",
        "2. **AI Agent Development**: Created a PydanticAI agent with custom tools\n",
        "3. **Semantic Search**: Implemented text-based recipe search using vector similarity\n",
        "4. **Production Features**: Added proper error handling and data conversion\n",
        "\n",
        "### Key Technologies Used\n",
        "\n",
        "- **LanceDB**: Multimodal vector database for efficient storage and retrieval\n",
        "- **PydanticAI**: Modern AI agent framework with type safety\n",
        "- **Sentence Transformers**: Text embeddings for semantic search\n",
        "- **CLIP**: Vision-language model for image understanding\n",
        "\n",
        "### Next Steps\n",
        "\n",
        "1. **Add Image Search**: Implement the image search functionality\n",
        "2. **Scale Up**: Use a larger recipe dataset\n",
        "3. **Deploy**: Deploy your agent to a cloud platform\n",
        "4. **Build a UI**: Wrap the agent in a web interface (e.g. Streamlit)\n",
        "5. **Add More Tools**: Extend the agent with additional capabilities\n",
        "\n",
        "### Running Your Agent\n",
        "\n",
        "To run your complete recipe agent, you can create a simple script:\n",
        "\n",
        "```python\n",
        "# Simple test script\n",
        "result = agent.run_sync(\"Find me some dessert recipes\")\n",
        "print(result.output)\n",
        "```\n",
        "\n",
        "Your agent is now ready to help users discover recipes through natural language conversations!\n"
      ]
    }
  ],
  "metadata": {
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}
