Refine error handling and context management in core modules

- Enhance context isolation in specialized_exceptions.py to prevent shared state mutations and ensure safe metadata updates.
- Improve execution timeout handling in security.py with backward-compatible fallback mechanisms.
- Streamline price extraction logic in c_intel.py to handle various input types more effectively.
- Update fallback messaging in paperless agent to preserve context and provide clearer error status during LLM access failures.
- Clean up import statements in test configuration files for better organization.

2025-09-28 21:04:59 -04:00

.claude

feat: enhance metric tracking and memory management utilities

2025-08-07 23:40:40 -04:00

.cursor/rules

Semantic (#15 )

2025-06-06 00:46:23 -04:00

.devcontainer

dockerfile fix

2025-09-28 16:14:20 -04:00

.github/workflows

feat: enhance coverage reporting and improve tool configuration (#55 )

2025-08-04 00:54:52 -04:00

.roo

Repopatch (#31 )

2025-07-12 23:06:26 -04:00

.sonar

fix: improve error handling in async cleanup and shutdown processes

2025-08-08 00:05:48 -04:00

.vscode

Tests (#56 )

2025-08-05 13:03:53 -04:00

.windsurf/rules

Vasceannie/issue32 (#41 )

2025-07-14 21:23:14 -04:00

docker

fix: resolve pre-commit hooks configuration and update dependencies

2025-08-07 18:51:42 -04:00

docs

fix: resolve pre-commit hooks configuration and update dependencies

2025-08-07 18:51:42 -04:00

examples

stuff

2025-09-27 23:06:13 -04:00

scripts

stuff

2025-09-27 23:06:13 -04:00

src

Refine error handling and context management in core modules

2025-09-28 21:04:59 -04:00

static

Repopatch (#31 )

2025-07-12 23:06:26 -04:00

tests

Refine error handling and context management in core modules

2025-09-28 21:04:59 -04:00

.dockerignore

refac

2025-07-17 21:22:04 -04:00

.env.example

route-n-plan (#44 )

2025-07-17 18:32:58 -04:00

.env.production

route-n-plan (#44 )

2025-07-17 18:32:58 -04:00

.flake8

Bb-core-restoration-backup (#54 )

2025-08-01 21:18:22 -04:00

.gitignore

Modernize research graph metadata for LangGraph v1 (#60 )

2025-09-19 03:01:18 -04:00

.mcp.json

Bb-core-restoration-backup (#54 )

2025-08-01 21:18:22 -04:00

.pre-commit-config.yaml

fix: resolve pre-commit hooks configuration and update dependencies

2025-08-07 18:51:42 -04:00

.pre-commit-test-policy.yaml

Tests (#56 )

2025-08-05 13:03:53 -04:00

.pylintrc

Bb-core-restoration-backup (#54 )

2025-08-01 21:18:22 -04:00

.repomixignore

Semantic (#15 )

2025-06-06 00:46:23 -04:00

.roomodes

Repopatch (#31 )

2025-07-12 23:06:26 -04:00

.sourcery.yaml

Bb-core-restoration-backup (#54 )

2025-08-01 21:18:22 -04:00

add-git-lab-cert.sh

Add script and certificate for git.lab integration

2025-09-28 20:14:12 -04:00

AGENTS.md

fix: resolve all pyrefly linting errors in Discord implementation

2025-09-20 18:17:56 -04:00

CLAUDE.local.md

feat: enhance coverage reporting and improve tool configuration (#55 )

2025-08-04 00:54:52 -04:00

CLAUDE.md

Tests (#56 )

2025-08-05 13:03:53 -04:00

config.yaml

Tests (#56 )

2025-08-05 13:03:53 -04:00

deploy.sh

route-n-plan (#44 )

2025-07-17 18:32:58 -04:00

dev.sh

Vasceannie/issue32 (#41 )

2025-07-14 21:23:14 -04:00

docker-compose.production.yml

route-n-plan (#44 )

2025-07-17 18:32:58 -04:00

Dockerfile.production

feat: enhance coverage reporting and improve tool configuration (#55 )

2025-08-04 00:54:52 -04:00

errors.md

Refactor type safety checks and enhance error handling across various modules

2025-09-28 13:45:52 -04:00

git.lab.crt

Add script and certificate for git.lab integration

2025-09-28 20:14:12 -04:00

langgraph.json

fix: resolve pre-commit hooks configuration and update dependencies

2025-08-07 18:51:42 -04:00

LICENSE

feat: add core embeddings functionality with multi-provider support and Jina integration

2025-05-07 14:25:50 -04:00

Makefile

feat: enhance coverage reporting and improve tool configuration (#55 )

2025-08-04 00:54:52 -04:00

mypy.ini

feat: enhance coverage reporting and improve tool configuration (#55 )

2025-08-04 00:54:52 -04:00

nginx.conf

route-n-plan (#44 )

2025-07-17 18:32:58 -04:00

package-lock.json

Bb-core-restoration-backup (#54 )

2025-08-01 21:18:22 -04:00

package.json

Vasceannie/issue32 (#41 )

2025-07-14 21:23:14 -04:00

pyproject.toml

Modernize research graph metadata for LangGraph v1 (#60 )

2025-09-19 03:01:18 -04:00

pyrefly.toml

Refactor type safety checks and enhance error handling across various modules

2025-09-28 13:45:52 -04:00

pyrightconfig.json

Refactor type safety checks and enhance error handling across various modules

2025-09-28 13:45:52 -04:00

pytest-vscode.ini

Tests (#56 )

2025-08-05 13:03:53 -04:00

README.md

Cleanup (#45 )

2025-07-20 13:21:05 -04:00

repomix.config.json

Bb-core-restoration-backup (#54 )

2025-08-01 21:18:22 -04:00

requirements.txt

Modernize research graph metadata for LangGraph v1 (#60 )

2025-09-19 03:01:18 -04:00

sonar-project.properties

feat: enhance coverage reporting and improve tool configuration (#55 )

2025-08-04 00:54:52 -04:00

TYPE_SAFETY_REFACTOR_PLAN.md

stuff

2025-09-27 23:06:13 -04:00

uv.lock

Refine error centralization for graph and HTTP utilities (#63 )

2025-09-20 21:42:29 -04:00

README.md

Business Buddy (Biz Budz)

Business Buddy is a sophisticated AI agent framework built on LangGraph, designed for business research, analysis, and document processing workflows. It provides a modular architecture for creating, managing, and executing AI-powered tasks with built-in support for various LLM providers, advanced RAG capabilities, and comprehensive data processing tools.

🚀 Features

Core Capabilities

Advanced RAG Integration: Full R2R (RAG to Riches) support with document deduplication, batch processing, and intelligent collection management
Multi-LLM Support: Compatible with OpenAI, Anthropic, Google Vertex AI, Cohere, and more
Modular Architecture: Organized into reusable nodes, graphs, and services for easy extension
Type Safety: Comprehensive type hints with Pydantic models throughout
Asynchronous by Design: Built for high-performance concurrent operations

Specialized Workflows

Market Research: Automated business and market analysis workflows
Menu Intelligence: Restaurant menu analysis and extraction
Document Processing: URL-to-R2R pipeline with intelligent content analysis
Web Scraping: Multiple strategies including Firecrawl, BeautifulSoup, and browser automation
Search Orchestration: Multi-provider search with caching and result ranking
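The search orchestration idea above (several providers, a cache, and ranked results) can be sketched in a few lines. This is only an illustration under stated assumptions: `search_all`, the `Provider` type, and the in-memory `_cache` are hypothetical names and are not taken from the Business Buddy codebase, which may rank and cache quite differently.

```python
import asyncio
from collections.abc import Awaitable, Callable

# Hypothetical shapes, for illustration only.
Result = tuple[str, float]  # (url, relevance score)
Provider = Callable[[str], Awaitable[list[Result]]]

_cache: dict[str, list[Result]] = {}


async def search_all(query: str, providers: list[Provider]) -> list[Result]:
    """Query every provider concurrently, merge by best score, and cache the ranking."""
    if query in _cache:
        return _cache[query]

    # Run all providers at the same time instead of one after another.
    batches = await asyncio.gather(*(provider(query) for provider in providers))

    # Keep the highest score seen for each URL across providers.
    best: dict[str, float] = {}
    for batch in batches:
        for url, score in batch:
            best[url] = max(score, best.get(url, 0.0))

    # Rank by score, highest first, and remember the result for this query.
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    _cache[query] = ranked
    return ranked
```

A production cache would likely need expiry and a shared backend such as Redis, which the development services already include; the dictionary here only keeps the sketch self-contained.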

📁 Project Structure

biz-budz/
├── src/biz_bud/          # Main application code
│   ├── graphs/           # LangGraph workflow definitions
│   ├── nodes/            # Modular processing nodes
│   │   ├── analysis/     # Data analysis and visualization
│   │   ├── core/         # Core functionality
│   │   ├── llm/          # LLM interactions
│   │   ├── rag/          # RAG and R2R integration
│   │   ├── research/     # Research workflows
│   │   ├── scraping/     # Web scraping strategies
│   │   ├── search/       # Search orchestration
│   │   └── validation/   # Content validation
│   ├── services/         # External service integrations
│   └── states/           # TypedDict state definitions
├── packages/             # Modular utility packages
│   ├── business-buddy-core/      # Core utilities
│   ├── business-buddy-extraction/# Entity extraction
│   ├── business-buddy-tools/     # Web tools & scrapers
│   └── business-buddy-utils/     # General utilities
└── examples/             # Usage examples and demos

🛠️ Installation

Prerequisites

Python 3.12+
UV package manager
Docker (for development services)

Quick Setup

Clone and set up:

git clone https://github.com/vasceannie/competitor-costing.git biz-budz
cd biz-budz
./scripts/setup-dev.sh

Configure environment:

cp .env.example .env
# Edit .env with your API keys:
# - OPENAI_API_KEY
# - ANTHROPIC_API_KEY (optional)
# - TAVILY_API_KEY (for web search)
# - FIRECRAWL_API_KEY (for advanced scraping)
# - R2R_BASE_URL (if using R2R)

Start development services:

make start  # Starts PostgreSQL, Redis, Qdrant

💻 Development

Commands

# Environment activation (always use this)
source .venv/bin/activate

# Run all code quality checks
make lint-all

# Run tests with coverage
make test

# Run tests in watch mode
make test_watch

# Format code
make format

# Run pre-commit hooks
make pre-commit

# Start/stop Docker services
make start
make stop

Code Quality Standards

This project enforces strict code quality:

Type Safety: No Any types or # type: ignore allowed
Linting: Ruff for style, Pyrefly for advanced type checking
Testing: Minimum 70% coverage requirement
Documentation: Imperative docstrings with punctuation
Pre-commit: Automatic hooks for all quality checks
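As a rough illustration of the standards above, the snippet below shows fully annotated code with imperative, punctuated docstrings and no `Any`. `PriceQuote` and `format_quote` are hypothetical names invented for this example; they do not come from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceQuote:
    """Represent a single price with its currency code."""

    amount: float
    currency: str


def format_quote(quote: PriceQuote) -> str:
    """Return the quote as a human-readable string."""
    # Every parameter and return value is annotated; no Any, no type: ignore.
    return f"{quote.amount:.2f} {quote.currency}"
```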

Testing

# Run all tests
pytest

# Run specific test file
pytest tests/unit_tests/nodes/rag/test_analyzer.py

# Run with coverage report
pytest --cov=biz_bud --cov-report=html

# Run integration tests only
pytest tests/integration_tests/

🔧 Configuration

Business Buddy uses a hierarchical configuration system:

Environment Variables (highest priority)
YAML Configuration (config.yaml)
Default Values (lowest priority)

Example configuration usage:

from biz_bud.config.loader import load_config

config = load_config()
# Access nested configuration
llm_config = config.llm_config
api_keys = config.api_config

📚 Usage Examples

Running a Research Workflow

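from langchain_core.messages import HumanMessage

from biz_bud.config.loader import load_config
from biz_bud.graphs.research import research_graph

config = load_config()

# Execute research workflow (run this inside an async function or event loop)
result = await research_graph.ainvoke({
    "messages": [HumanMessage(content="Research the coffee shop market in Seattle")],
    "config": config
})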

URL to R2R Document Processing

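from biz_bud.config.loader import load_config
from biz_bud.graphs.url_to_r2r import url_to_r2r_graph

config = load_config()

# Process URLs and upload to R2R (run this inside an async function or event loop)
result = await url_to_r2r_graph.ainvoke({
    "url": "https://docs.example.com",
    "config": config
})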

Using the RAG Agent

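# NOTE: The biz_bud.agents.rag_agent module has been deleted.
# The example below is kept for reference only and will not run.
#
# from biz_bud.agents.rag_agent import create_rag_agent_executor
#
# agent = create_rag_agent_executor(config)
# result = await agent.ainvoke({
#     "messages": [HumanMessage(content="What are the key features of R2R?")]
# })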

🏗️ Architecture Highlights

Key Design Patterns

State-Driven Workflows: TypedDict states ensure type safety across graph executions
Service Abstraction: Clean interfaces for all external dependencies
Decorator Pattern: Centralized error handling and logging via @log_config and @error_handling
Modular Nodes: Each node has a single responsibility and is independently testable
Parallel Processing: Extensive use of asyncio for concurrent operations
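To make the state-driven and decorator patterns above more concrete, here is a minimal sketch of an async node that takes a TypedDict state and is wrapped by an error-handling decorator. The names `ResearchState`, `handle_node_errors`, and `summarize` are hypothetical; the project's own `@error_handling` and `@log_config` decorators are not shown in this README and may have different signatures and behavior.

```python
import asyncio
import functools
from collections.abc import Awaitable, Callable
from typing import TypedDict


class ResearchState(TypedDict, total=False):
    """Hypothetical workflow state; real state classes live in src/biz_bud/states."""

    query: str
    summary: str
    errors: list[str]


NodeFn = Callable[[ResearchState], Awaitable[ResearchState]]


def handle_node_errors(node: NodeFn) -> NodeFn:
    """Wrap an async node so exceptions are recorded in state instead of raised."""

    @functools.wraps(node)
    async def wrapper(state: ResearchState) -> ResearchState:
        try:
            return await node(state)
        except Exception as exc:  # broad on purpose: this is the central handler
            errors = [*state.get("errors", []), f"{node.__name__}: {exc}"]
            return {**state, "errors": errors}

    return wrapper


@handle_node_errors
async def summarize(state: ResearchState) -> ResearchState:
    """Summarize the query, failing when no query is present."""
    if not state.get("query"):
        raise ValueError("query is required")
    return {**state, "summary": f"Summary for: {state['query']}"}
```

Returning a new dictionary rather than mutating the incoming state keeps each node easy to test in isolation, for example with `asyncio.run(summarize({"query": "coffee"}))`.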

RAG Integration

R2R Support: Full integration with R2R for document storage and retrieval
Intelligent Deduplication: Content-based and URL-based duplicate detection
Batch Processing: Efficient handling of large document sets
Collection Management: Automatic collection assignment based on source domains
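The deduplication and collection-assignment ideas above can be sketched with the standard library alone. This is an assumption-laden illustration, not the project's implementation: the normalization rules, the SHA-256 fingerprint, and the function names (`normalize_url`, `content_fingerprint`, `collection_for`, `deduplicate`) are all hypothetical.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, and drop the fragment and any trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def content_fingerprint(text: str) -> str:
    """Hash whitespace-normalized, case-folded text."""
    canonical = " ".join(text.split()).casefold()
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def collection_for(url: str) -> str:
    """Derive a collection name from the source domain."""
    host = urlsplit(url).hostname or "unknown"
    return host.removeprefix("www.")


def deduplicate(docs: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep the first (url, text) pair per normalized URL and per content hash."""
    seen_urls: set[str] = set()
    seen_hashes: set[str] = set()
    unique: list[tuple[str, str]] = []
    for url, text in docs:
        key, digest = normalize_url(url), content_fingerprint(text)
        if key in seen_urls or digest in seen_hashes:
            continue  # duplicate by URL or by content
        seen_urls.add(key)
        seen_hashes.add(digest)
        unique.append((url, text))
    return unique
```

Exact hashing only catches identical content after normalization; near-duplicate detection would need a similarity measure, which is outside the scope of this sketch.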

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes with tests
Ensure all checks pass (make lint-all && make test)
Commit with a descriptive message
Push and create a Pull Request

Development Principles

Always use UV for package management
Ensure all code is strongly typed
Write tests for new functionality
Follow existing code patterns and conventions
Never use --no-verify flag for commits

🚀 CI/CD

GitHub Actions workflows ensure code quality:

Code Quality (lint.yml): Runs all linters and type checkers
Unit Tests (unit-tests.yml): Executes test suite with coverage
Integration Tests (integration-tests.yml): Validates full workflows

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgements

LangChain - Foundation for agent development
LangGraph - Graph-based workflow orchestration
R2R - RAG system integration
Firecrawl - Advanced web scraping
UV - Fast Python package management

Note: Always activate the virtual environment with source .venv/bin/activate and use UV for all package management operations.