* feat: enhance coverage reporting and improve tool configuration
  - Added support for JSON coverage reports in pyproject.toml.
  - Updated .gitignore to include coverage.json and task files for better management.
  - Introduced a new Type Safety Audit Report to document findings and recommendations for type-safety improvements.
  - Created a comprehensive coverage configuration guide to assist in understanding the coverage reporting setup.
  - Refactored tools configuration to use environment variables for concurrent scraping settings.
  These changes improve the project's testing and reporting capabilities while enhancing overall code quality and maintainability.

* feat: enhance configuration handling and improve error logging
  - Introduced a new utility function `_get_env_int` for robust environment-variable integer retrieval with validation (a sketch follows this list).
  - Updated `WebToolsConfig` and `ToolsConfigModel` to use the new utility for environment-variable defaults.
  - Enhanced logging in `CircuitBreaker` to provide detailed state-transition information (also sketched below).
  - Improved URL handling in `url_analyzer.py` for better file-extension extraction and normalization.
  - Added type validation and logging in `SecureInputMixin` to keep input sanitization and validation consistent.
  These changes improve the reliability and maintainability of configuration management and error handling across the codebase.

* refactor: update imports and enhance .gitignore for improved organization
  - Updated import paths in various example scripts to reflect the new structure under `biz_bud`.
  - Enhanced .gitignore with clearer formatting for task files.
  - Removed obsolete function calls and improved error handling in several scripts.
  - Added a public alias for backward compatibility in `upload_r2r.py`.
  These changes improve code organization, maintainability, and compatibility across the project.

* refactor: update graph paths in langgraph.json for improved organization
  - Changed paths for the research, catalog, paperless, and url_to_r2r graphs to reflect the new directory structure.
  - Added new entries for the analysis and scraping graphs.
  These changes improve the organization and maintainability of the graph configurations.

* fix: enhance validation and error handling in date range and scraping functions
  - Updated date validation in `UserFiltersModel` to ensure date values are strings.
  - Improved error messages in `create_scraped_content_dict` to clarify the conditions for success and failure.
  - Expanded test coverage for date validation and scraped-content creation to ensure robustness.
  These changes improve input validation and error handling across the application, enhancing overall reliability.

* refactor: streamline graph creation and enhance type annotations in examples
  - Simplified graph creation in `catalog_ingredient_research_example.py` and `catalog_tech_components_example.py` by compiling the graph directly.
  - Updated type annotations in `catalog_intel_with_config.py` for improved clarity and consistency.
  - Enhanced error handling in catalog data processing to guard against unexpected data types.
  These changes improve code readability, maintainability, and error resilience across the example scripts.

* Update src/biz_bud/nodes/extraction/extractors.py (co-authored with sourcery-ai[bot])

* Update src/biz_bud/core/validation/pydantic_models.py (co-authored with sourcery-ai[bot])

* refactor: migrate Jina and Tavily clients to use ServiceFactory dependency injection

* refactor: migrate URL processing to a provider-based architecture with improved error handling

* feat: add FirecrawlApp compatibility classes and mock implementations (a hypothetical stand-in appears after the example file below)

* fix: add thread-safe locking to LazyLoader factory management (sketched below)

* feat: implement service restart and refactor cache decorator helpers

* refactor: move r2r_direct_api_call to tools.clients.r2r_utils and improve HTTP service error handling

* chore: update Sonar task IDs in report configuration

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
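The `_get_env_int` helper is named above but not shown in this log. Below is a minimal sketch of what such a validated lookup might look like; the exact signature, the `minimum` guard, and the `SCRAPER_MAX_CONCURRENT` variable name are illustrative assumptions, not the project's actual code.

```python
import logging
import os

logger = logging.getLogger(__name__)


def _get_env_int(name: str, default: int, *, minimum: int | None = None) -> int:
    """Read an integer from the environment, falling back to a default.

    Invalid or out-of-range values are logged and replaced with the
    default rather than raising, so a bad environment variable cannot
    break application startup.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        logger.warning("Invalid integer for %s: %r; using default %d", name, raw, default)
        return default
    if minimum is not None and value < minimum:
        logger.warning("%s=%d is below minimum %d; using default %d", name, value, minimum, default)
        return default
    return value
```

A config model like `WebToolsConfig` could then take a field default from `_get_env_int("SCRAPER_MAX_CONCURRENT", 5, minimum=1)`.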
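The log only states that `CircuitBreaker` logging was enhanced with state-transition detail. One plausible shape for that, with the state names, thresholds, and method names all assumed for illustration:

```python
import logging
import time
from enum import Enum

logger = logging.getLogger(__name__)


class State(Enum):
    CLOSED = "closed"        # requests flow normally
    OPEN = "open"            # requests are rejected
    HALF_OPEN = "half_open"  # a single probe request is allowed through


class CircuitBreaker:
    """Toy circuit breaker that logs every state transition with context."""

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0) -> None:
        self._state = State.CLOSED
        self._failures = 0
        self._opened_at = 0.0
        self._failure_threshold = failure_threshold
        self._reset_timeout = reset_timeout

    def _transition(self, new_state: State, reason: str) -> None:
        # The "detailed state transition information": old state, new
        # state, the reason, and the current failure count.
        logger.info(
            "CircuitBreaker %s -> %s (%s; failures=%d)",
            self._state.value, new_state.value, reason, self._failures,
        )
        self._state = new_state

    def record_failure(self) -> None:
        self._failures += 1
        if self._state is State.HALF_OPEN:
            self._opened_at = time.monotonic()
            self._transition(State.OPEN, "probe request failed")
        elif self._state is State.CLOSED and self._failures >= self._failure_threshold:
            self._opened_at = time.monotonic()
            self._transition(State.OPEN, "failure threshold reached")

    def record_success(self) -> None:
        self._failures = 0
        if self._state is not State.CLOSED:
            self._transition(State.CLOSED, "call succeeded")

    def allow_request(self) -> bool:
        if self._state is State.OPEN and time.monotonic() - self._opened_at >= self._reset_timeout:
            self._transition(State.HALF_OPEN, "reset timeout elapsed")
        return self._state is not State.OPEN
```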
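"Thread-safe locking to LazyLoader factory management" most likely means guarding first-time instance creation against races. A common way to do that is double-checked locking, roughly as below; the class shape and the `get` signature are assumptions:

```python
import threading
from collections.abc import Callable
from typing import Any


class LazyLoader:
    """Cache factory results so each instance is created at most once.

    A lock guards first-time creation; without it, two threads asking
    for the same key concurrently could both run the factory.
    """

    def __init__(self) -> None:
        self._instances: dict[str, Any] = {}
        self._lock = threading.Lock()

    def get(self, key: str, factory: Callable[[], Any]) -> Any:
        # Fast path: return an existing instance without taking the lock.
        try:
            return self._instances[key]
        except KeyError:
            pass
        with self._lock:
            # Re-check under the lock: another thread may have created
            # the instance between our failed lookup and acquiring it.
            if key not in self._instances:
                self._instances[key] = factory()
            return self._instances[key]
```

The unlocked fast path is safe in CPython because a plain dict read is atomic under the GIL; only the create-and-insert step needs mutual exclusion.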
56 lines · 1.6 KiB · Python
"""Example usage of enhanced Firecrawl API endpoints."""
|
|
|
|
import asyncio
|
|
|
|
# Note: These imports are from the original firecrawl library
|
|
# They are not available in our current client implementation
|
|
# This example is disabled as it requires the actual firecrawl library
|
|
|
|
|
|
async def example_map_website():
|
|
"""Demonstrate using the map endpoint to discover URLs."""
|
|
print("This example requires the actual firecrawl-py library")
|
|
return
|
|
|
|
|
|
async def example_crawl_website():
|
|
"""Demonstrate using the crawl endpoint for deep website crawling."""
|
|
print("This example requires the actual firecrawl-py library")
|
|
return
|
|
|
|
|
|
async def example_search_and_scrape():
|
|
"""Demonstrate using the search endpoint to search and scrape results."""
|
|
print("This example requires the actual firecrawl-py library")
|
|
return
|
|
|
|
|
|
async def example_extract_structured_data():
|
|
"""Demonstrate using the extract endpoint for AI-powered extraction."""
|
|
print("This example requires the actual firecrawl-py library")
|
|
return
|
|
|
|
|
|
async def example_rag_integration():
|
|
"""Demonstrate using Firecrawl for RAG pipeline."""
|
|
print("This example requires the actual firecrawl-py library")
|
|
return
|
|
|
|
|
|
async def main():
|
|
"""Run all the Firecrawl examples."""
|
|
print("Enhanced Firecrawl API Examples")
|
|
print("=" * 40)
|
|
print("Note: These examples require the firecrawl-py library")
|
|
print()
|
|
|
|
await example_map_website()
|
|
await example_crawl_website()
|
|
await example_search_and_scrape()
|
|
await example_extract_structured_data()
|
|
await example_rag_integration()
|
|
|
|
|
|
if __name__ == "__main__":
|
|
asyncio.run(main())
|
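Since every example above just prints a notice, the "FirecrawlApp compatibility classes and mock implementations" from the commit list suggest how this script could eventually run offline. A hypothetical stand-in, whose class and method names are assumptions rather than the real firecrawl-py surface:

```python
class MockFirecrawlApp:
    """Offline stand-in so the examples could run without firecrawl-py.

    Method names here are assumed for illustration; check the real
    library before relying on this surface.
    """

    def map_url(self, url: str) -> dict:
        # Canned response shaped like a URL-discovery result.
        return {"links": [f"{url}/about", f"{url}/blog"]}

    def scrape_url(self, url: str) -> dict:
        # Canned response shaped like a scrape result.
        return {"markdown": f"# Scraped content from {url}"}
```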