
Business Buddy (Biz Budz)


Business Buddy is a sophisticated AI agent framework built on LangGraph, designed for business research, analysis, and document processing workflows. It provides a modular architecture for creating, managing, and executing AI-powered tasks with built-in support for various LLM providers, advanced RAG capabilities, and comprehensive data processing tools.

🚀 Features

Core Capabilities

  • Advanced RAG Integration: Full R2R (RAG to Riches) support with document deduplication, batch processing, and intelligent collection management
  • Multi-LLM Support: Compatible with OpenAI, Anthropic, Google VertexAI, Cohere, and more
  • Modular Architecture: Organized into reusable nodes, graphs, and services for easy extension
  • Type Safety: Comprehensive type hints with Pydantic models throughout
  • Asynchronous by Design: Built for high-performance concurrent operations

Specialized Workflows

  • Market Research: Automated business and market analysis workflows
  • Menu Intelligence: Restaurant menu analysis and extraction
  • Document Processing: URL-to-R2R pipeline with intelligent content analysis
  • Web Scraping: Multiple strategies including Firecrawl, BeautifulSoup, and browser automation
  • Search Orchestration: Multi-provider search with caching and result ranking

📁 Project Structure

biz-budz/
├── src/biz_bud/          # Main application code
│   ├── graphs/           # LangGraph workflow definitions
│   ├── nodes/            # Modular processing nodes
│   │   ├── analysis/     # Data analysis and visualization
│   │   ├── core/         # Core functionality
│   │   ├── llm/          # LLM interactions
│   │   ├── rag/          # RAG and R2R integration
│   │   ├── research/     # Research workflows
│   │   ├── scraping/     # Web scraping strategies
│   │   ├── search/       # Search orchestration
│   │   └── validation/   # Content validation
│   ├── services/         # External service integrations
│   └── states/           # TypedDict state definitions
├── packages/             # Modular utility packages
│   ├── business-buddy-core/      # Core utilities
│   ├── business-buddy-extraction/# Entity extraction
│   ├── business-buddy-tools/     # Web tools & scrapers
│   └── business-buddy-utils/     # General utilities
└── examples/             # Usage examples and demos

🛠️ Installation

Prerequisites

Quick Setup

  1. Clone and setup:

    git clone https://github.com/vasceannie/competitor-costing.git biz-budz
    cd biz-budz
    ./scripts/setup-dev.sh
    
  2. Configure environment:

    cp .env.example .env
    # Edit .env with your API keys:
    # - OPENAI_API_KEY
    # - ANTHROPIC_API_KEY (optional)
    # - TAVILY_API_KEY (for web search)
    # - FIRECRAWL_API_KEY (for advanced scraping)
    # - R2R_BASE_URL (if using R2R)
    
  3. Start development services:

    make start  # Starts PostgreSQL, Redis, Qdrant
    

💻 Development

Commands

# Environment activation (always use this)
source .venv/bin/activate

# Run all code quality checks
make lint-all

# Run tests with coverage
make test

# Run tests in watch mode
make test_watch

# Format code
make format

# Run pre-commit hooks
make pre-commit

# Start/stop Docker services
make start
make stop

Code Quality Standards

This project enforces strict code quality:

  • Type Safety: No Any types or # type: ignore allowed
  • Linting: Ruff for style, Pyrefly for advanced type checking
  • Testing: Minimum 70% coverage requirement
  • Documentation: Imperative docstrings with punctuation
  • Pre-commit: Automatic hooks for all quality checks

Testing

# Run all tests
pytest

# Run specific test file
pytest tests/unit_tests/nodes/rag/test_analyzer.py

# Run with coverage report
pytest --cov=biz_bud --cov-report=html

# Run integration tests only
pytest tests/integration_tests/

🔧 Configuration

Business Buddy uses a hierarchical configuration system:

  1. Environment Variables (highest priority)
  2. YAML Configuration (config.yaml)
  3. Default Values (lowest priority)

Example configuration usage:

from biz_bud.config.loader import load_config

config = load_config()
# Access nested configuration
llm_config = config.llm_config
api_keys = config.api_config
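
The precedence order described above can be pictured as a layered merge. The following is only a minimal sketch under stated assumptions, not the actual load_config implementation: the resolve_config helper, the BIZ_BUD_ prefix, and the flat key names are all hypothetical and chosen purely for illustration.

```python
from collections.abc import Mapping


def resolve_config(
    defaults: Mapping[str, str],
    yaml_values: Mapping[str, str],
    environ: Mapping[str, str],
    prefix: str = "BIZ_BUD_",  # hypothetical prefix, not taken from the project
) -> dict[str, str]:
    """Merge three layers so that higher-priority layers override lower ones."""
    merged = dict(defaults)  # lowest priority: built-in defaults
    merged.update(yaml_values)  # middle priority: values parsed from config.yaml
    for name, value in environ.items():
        if name.startswith(prefix):  # highest priority: environment variables
            merged[name[len(prefix):].lower()] = value
    return merged


settings = resolve_config(
    defaults={"model": "default-model", "temperature": "0.7"},
    yaml_values={"model": "yaml-model"},
    environ={"BIZ_BUD_MODEL": "env-model", "PATH": "/usr/bin"},
)
print(settings)
```

Here the environment value wins for model, while temperature falls through to the default because neither higher layer sets it. Unrelated variables such as PATH are ignored.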

📚 Usage Examples

Running a Research Workflow

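from langchain_core.messages import HumanMessage

from biz_bud.config.loader import load_config
from biz_bud.graphs.research import research_graph

config = load_config()

# Execute research workflow (run inside an async function or event loop)
result = await research_graph.ainvoke({
    "messages": [HumanMessage(content="Research the coffee shop market in Seattle")],
    "config": config
})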

URL to R2R Document Processing

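from biz_bud.config.loader import load_config
from biz_bud.graphs.url_to_r2r import url_to_r2r_graph

config = load_config()

# Process URLs and upload to R2R (run inside an async function or event loop)
result = await url_to_r2r_graph.ainvoke({
    "url": "https://docs.example.com",
    "config": config
})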

Using the RAG Agent

# NOTE: biz_bud.agents.rag_agent has been deleted, so this example is kept for reference only.
# from biz_bud.agents.rag_agent import create_rag_agent_executor
#
# agent = create_rag_agent_executor(config)
# result = await agent.ainvoke({
#     "messages": [HumanMessage(content="What are the key features of R2R?")]
# })

🏗️ Architecture Highlights

Key Design Patterns

  • State-Driven Workflows: TypedDict states ensure type safety across graph executions
  • Service Abstraction: Clean interfaces for all external dependencies
  • Decorator Pattern: Centralized error handling and logging via @log_config and @error_handling
  • Modular Nodes: Each node has a single responsibility and is independently testable
  • Parallel Processing: Extensive use of asyncio for concurrent operations
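
To make these patterns concrete, here is a small self-contained sketch that combines a TypedDict state, a decorator that centralizes error handling, and concurrent node execution with asyncio. Everything in it is an assumption for illustration: ResearchState, search_node, failing_node, and run_nodes are hypothetical names, and this error_handling decorator is a simplified stand-in rather than the project's own @error_handling.

```python
import asyncio
import functools
import logging
from collections.abc import Awaitable, Callable
from typing import TypedDict

logger = logging.getLogger(__name__)


class ResearchState(TypedDict, total=False):
    """Hypothetical workflow state used only for this example."""

    query: str
    results: list[str]
    errors: list[str]


NodeFn = Callable[[ResearchState], Awaitable[ResearchState]]


def error_handling(node: NodeFn) -> NodeFn:
    """Illustrative stand-in for a centralized error-handling decorator."""

    @functools.wraps(node)
    async def wrapper(state: ResearchState) -> ResearchState:
        try:
            return await node(state)
        except Exception as exc:  # record the failure instead of aborting the run
            logger.warning("Node %s failed: %s", node.__name__, exc)
            return {"errors": [*state.get("errors", []), f"{node.__name__}: {exc}"]}

    return wrapper


@error_handling
async def search_node(state: ResearchState) -> ResearchState:
    """Return a partial state update containing only the keys this node owns."""
    return {"results": [f"result for {state['query']}"]}


@error_handling
async def failing_node(state: ResearchState) -> ResearchState:
    """Simulate an external dependency that is down."""
    raise RuntimeError("provider unavailable")


async def run_nodes(state: ResearchState) -> ResearchState:
    # Run independent nodes concurrently, then fold their updates into the state.
    updates = await asyncio.gather(search_node(state), failing_node(state))
    merged = ResearchState(**state)
    for update in updates:
        merged.update(update)
    return merged


final_state = asyncio.run(run_nodes({"query": "coffee shops in Seattle"}))
print(final_state)
```

Because each node returns only the keys it owns, the failing node contributes an errors entry without discarding the results produced by the node that succeeded.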

RAG Integration

  • R2R Support: Full integration with R2R for document storage and retrieval
  • Intelligent Deduplication: Content-based and URL-based duplicate detection
  • Batch Processing: Efficient handling of large document sets
  • Collection Management: Automatic collection assignment based on source domains
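
The deduplication, batching, and collection-assignment ideas above can be sketched with the standard library alone. This is an illustration under assumptions, not the project's implementation: the helper names (normalize_url, content_fingerprint, collection_for, deduplicate, batched), the normalization rules, and the collection naming scheme are all hypothetical, and no R2R API calls are shown.

```python
import hashlib
from urllib.parse import urlsplit

Document = tuple[str, str, str]  # (collection, normalized URL, text)


def normalize_url(url: str) -> str:
    """Lower-case scheme and host, and drop the fragment and any trailing slash."""
    parts = urlsplit(url)
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path.rstrip('/')}{query}"


def content_fingerprint(text: str) -> str:
    """Hash case- and whitespace-normalized text so trivial reformatting still matches."""
    canonical = " ".join(text.split()).lower()
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def collection_for(url: str) -> str:
    """Derive a collection name from the source domain (hypothetical naming rule)."""
    host = urlsplit(url).hostname or "unknown"
    return host.removeprefix("www.").replace(".", "_")


def deduplicate(documents: list[tuple[str, str]]) -> list[Document]:
    """Keep the first document seen for each normalized URL and content fingerprint."""
    seen_urls: set[str] = set()
    seen_hashes: set[str] = set()
    unique: list[Document] = []
    for url, text in documents:
        key, digest = normalize_url(url), content_fingerprint(text)
        if key in seen_urls or digest in seen_hashes:
            continue  # URL-based or content-based duplicate
        seen_urls.add(key)
        seen_hashes.add(digest)
        unique.append((collection_for(url), key, text))
    return unique


def batched(items: list[Document], size: int) -> list[list[Document]]:
    """Split documents into fixed-size batches for upload."""
    return [items[i : i + size] for i in range(0, len(items), size)]


docs = [
    ("https://docs.example.com/guide/", "Getting started"),
    ("https://DOCS.example.com/guide#intro", "Getting started (updated)"),  # same URL
    ("https://www.example.org/post", "getting   STARTED"),  # same content
    ("https://www.example.org/other", "Pricing overview"),
]
unique_docs = deduplicate(docs)
batches = batched(unique_docs, size=1)
print(unique_docs)
```

The second entry is dropped because its normalized URL matches the first, and the third is dropped because its normalized text hashes to the same fingerprint. The two surviving documents land in collections named after their source domains.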

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes with tests
  4. Ensure all checks pass (make lint-all && make test)
  5. Commit with a descriptive message
  6. Push and create a Pull Request

Development Principles

  • Always use UV for package management
  • Ensure all code is strongly typed
  • Write tests for new functionality
  • Follow existing code patterns and conventions
  • Never use --no-verify flag for commits

🚀 CI/CD

GitHub Actions workflows ensure code quality:

  1. Code Quality (lint.yml): Runs all linters and type checkers
  2. Unit Tests (unit-tests.yml): Executes test suite with coverage
  3. Integration Tests (integration-tests.yml): Validates full workflows

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgements

  • LangChain - Foundation for agent development
  • LangGraph - Graph-based workflow orchestration
  • R2R - RAG system integration
  • Firecrawl - Advanced web scraping
  • UV - Fast Python package management

Note: Always activate the virtual environment with source .venv/bin/activate and use UV for all package management operations.
