This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
Claude-Scripts is a comprehensive Python code quality analysis toolkit implementing a layered, plugin-based architecture for detecting duplicate code, measuring complexity metrics, and identifying modernization opportunities. The system uses sophisticated similarity algorithms, including LSH, for scalable analysis of large codebases.
Development Commands
Essential Commands
# Activate virtual environment and install dependencies
source .venv/bin/activate && uv pip install -e ".[dev]"
# Run all quality checks
make check-all
# Run linting and auto-fix issues
make format
# Run type checking
make typecheck
# Run tests with coverage
make test-cov
# Run a single test
source .venv/bin/activate && pytest path/to/test_file.py::TestClass::test_method -xvs
# Install pre-commit hooks
make install-dev
# Build distribution packages
make build
CLI Usage Examples
# Detect duplicate code
claude-quality duplicates src/ --threshold 0.8 --format console
# Analyze complexity
claude-quality complexity src/ --threshold 10 --format json
# Modernization analysis
claude-quality modernization src/ --include-type-hints
# Full analysis
claude-quality full-analysis src/ --output report.json
# Create exceptions template
claude-quality create-exceptions-template --output-path .quality-exceptions.yaml
Architecture Overview
Core Design Pattern: Plugin-Based Analysis Pipeline
CLI Layer (cli/main.py) → Configuration (config/schemas.py) → Analysis Engines → Output Formatters
The system implements multiple design patterns:
- Strategy Pattern: Similarity algorithms (LevenshteinSimilarity, JaccardSimilarity, etc.) are interchangeable (see the sketch below)
- Visitor Pattern: AST traversal for code analysis
- Factory Pattern: Dynamic engine creation based on configuration
- Composite Pattern: Multiple engines combine for full_analysis
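For instance, the Strategy pattern can be pictured as a small abstract base class. This is only a sketch; the method name and signature are assumptions rather than the project's actual API:

from abc import ABC, abstractmethod

class BaseSimilarityAlgorithm(ABC):
    """Interchangeable similarity strategy (the real interface may differ)."""
    @abstractmethod
    def similarity(self, left: str, right: str) -> float:
        """Return a score between 0.0 and 1.0."""

class JaccardSimilarity(BaseSimilarityAlgorithm):
    def similarity(self, left: str, right: str) -> float:
        # Token-set Jaccard: intersection over union of whitespace-split tokens.
        a, b = set(left.split()), set(right.split())
        return len(a & b) / len(a | b) if (a | b) else 1.0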
Critical Module Interactions
Duplicate Detection Flow:
1. FileFinder discovers Python files based on path configuration
2. ASTAnalyzer extracts code blocks (functions, classes, methods)
3. DuplicateDetectionEngine orchestrates analysis:
   - For small codebases: Direct similarity comparison
   - For large codebases (>1000 files): LSH-based scalable detection
4. SimilarityCalculator applies weighted algorithm combination
5. Results are filtered through ExceptionFilter for configured suppressions
Similarity Algorithm System:
- Multiple algorithms run in parallel with configurable weights
- Algorithms grouped by type: text-based, token-based, structural, semantic
- Final score = weighted combination of individual algorithm scores
- LSH (Locality-Sensitive Hashing) enables O(n log n) scaling for large datasets
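As a rough sketch of that weighted combination (the algorithm names, weights, and normalization here are assumptions, not the project's exact formula):

def combined_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-algorithm scores; weights are normalized to sum to 1."""
    total_weight = sum(weights.get(name, 0.0) for name in scores)
    if total_weight == 0.0:
        return 0.0
    return sum(s * weights.get(name, 0.0) for name, s in scores.items()) / total_weight

# combined_score({"levenshtein": 0.9, "jaccard": 0.7}, {"levenshtein": 0.6, "jaccard": 0.4}) ≈ 0.82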
Configuration Hierarchy:
QualityConfig
├── detection: Algorithm weights, thresholds, LSH parameters
├── complexity: Metrics selection, thresholds per metric
├── languages: File extensions, language-specific rules
├── paths: Include/exclude patterns for file discovery
└── exceptions: Suppression rules with pattern matching
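A minimal Pydantic sketch of this shape (field names and defaults are assumptions; the authoritative definitions live in config/schemas.py):

from pydantic import BaseModel

class DetectionConfig(BaseModel):
    similarity_threshold: float = 0.8
    algorithm_weights: dict[str, float] = {"levenshtein": 0.5, "jaccard": 0.5}

class PathsConfig(BaseModel):
    include: list[str] = ["**/*.py"]
    exclude: list[str] = [".venv/**"]

class QualityConfig(BaseModel):
    detection: DetectionConfig = DetectionConfig()
    paths: PathsConfig = PathsConfig()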
Key Implementation Details
Pydantic Version Constraint:
- Must use Pydantic 2.5.x (not 2.6+ or 2.11+) due to compatibility issues
- Configuration schemas use Pydantic for validation and defaults
AST Analysis Strategy:
- Uses Python's standard ast module for parsing
- Custom NodeVisitor subclasses for different analysis types (example below)
- Preserves line numbers and column offsets for accurate reporting
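A minimal NodeVisitor example (illustrative only; the project's visitors collect richer metadata):

import ast

class FunctionCollector(ast.NodeVisitor):
    """Collect (name, line, column) for every function definition."""
    def __init__(self) -> None:
        self.functions: list[tuple[str, int, int]] = []

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        self.functions.append((node.name, node.lineno, node.col_offset))
        self.generic_visit(node)

tree = ast.parse("def foo():\n    pass\n")
collector = FunctionCollector()
collector.visit(tree)
# collector.functions == [("foo", 1, 0)]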
Performance Optimizations:
- File-based caching with configurable TTL
- Parallel processing for multiple files
- LSH indexing for large-scale duplicate detection
- Incremental analysis support through cache
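The TTL check can be imagined roughly as follows (a sketch only; the cache keys and storage layout are project-specific and assumed here):

import json
import time
from pathlib import Path

def load_cached(path: Path, ttl_seconds: float) -> dict[str, object] | None:
    """Return the cached payload if the file exists and is younger than the TTL."""
    if not path.exists():
        return None
    if time.time() - path.stat().st_mtime > ttl_seconds:
        return None
    data = json.loads(path.read_text())
    return data if isinstance(data, dict) else None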
Testing Approach
Test Structure:
- Unit tests for individual algorithms and components
- Integration tests for end-to-end CLI commands
- Property-based testing for similarity algorithms
- Fixture-based test data in tests/fixtures/
Coverage Requirements:
- Minimum 80% coverage enforced in CI
- Focus on algorithm correctness and edge cases
- Mocking external dependencies (file I/O, Git operations)
Important Configuration Files
pyproject.toml:
- Package metadata and dependencies
- Ruff configuration (linting rules)
- MyPy configuration (type checking)
- Pytest configuration (test discovery and coverage)
Makefile:
- Standardizes development commands
- Ensures virtual environment activation
- Combines multiple tools into single targets
.pre-commit-config.yaml:
- Automated code quality checks on commit
- Includes ruff, mypy, and standard hooks
Code Quality Standards
Linting Configuration
- Ruff with extensive rule selection (E, F, W, UP, ANN, etc.)
- Ignored rules configured for pragmatic development
- Auto-formatting enabled with make format
Type Checking
- Strict MyPy configuration
- All public APIs must have type annotations
- Ignores for third-party libraries without stubs
Project Structure Conventions
- Similarity algorithms inherit from BaseSimilarityAlgorithm
- Analysis engines follow the analyze() → AnalysisResult pattern
- Configuration uses Pydantic models with validation
- Results formatted through dedicated formatter classes
Critical Dependencies
Analysis Core:
- radon: Industry-standard complexity metrics
- datasketch: LSH implementation for scalable similarity
- python-Levenshtein: Fast string similarity
Infrastructure:
- click: CLI framework with subcommand support
- pydantic==2.5.3: Configuration and validation (version-locked)
- pyyaml: Configuration file parsing
Development:
- uv: Fast Python package manager (replaces pip)
- pytest: Testing framework with coverage
- ruff: Fast Python linter and formatter
- mypy: Static type checking
0) Global Requirements
- Python: Target 3.12+.
- Typing: Modern syntax only (e.g., int | None; built-in generics like list[str]).
- Validation: Pydantic v2+ only for schema/validation.
- Complexity: Cyclomatic complexity < 15 per function/method.
- Module Size: < 750 lines per module. If a module exceeds 750 lines, convert it into a package (e.g., module.py → package/__init__.py + package/module.py).
- API Surface: Export functions via facades or classes so import sites remain concise.
- Code Reuse: No duplication. Prefer helper extraction, composition, or extension.
1) Prohibited Constructs
- ❌ No Any: Do not import, alias, or use typing.Any, Any, or equivalents.
- ❌ No ignores: Do not use # type: ignore, # pyright: ignore, or similar.
- ❌ No casts: Do not use typing.cast or equivalents.
If a third-party library leaks Any, contain it using the allowed strategies below.
2) Allowed Strategies (instead of casts/ignores)
Apply one or more of these defensive typing techniques at integration boundaries.
2.1 Overloads (encode expectations)
Use overloads to express distinct input/return contracts.
from typing import overload, Literal
@overload
def fetch(kind: Literal["summary"]) -> str: ...
@overload
def fetch(kind: Literal["items"]) -> list[Item]: ...
def fetch(kind: str):
raw = _raw_fetch(kind)
return _normalize(kind, raw)
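Call sites then receive precise types from whichever overload matches:

summary: str = fetch("summary")
items: list[Item] = fetch("items")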
2.2 TypeGuard (safe narrowing)
Use TypeGuard to prove a shape and narrow types.
from typing import TypeGuard
def is_item(x: object) -> TypeGuard[Item]:
return isinstance(x, dict) and isinstance(x.get("id"), str) and isinstance(x.get("value"), int)
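Inside a guarded branch the checker narrows the value to Item, so fields can be read without casts:

def read_value(x: object) -> int:
    if is_item(x):
        return x["value"]  # x is narrowed to Item here
    raise TypeError("not an Item")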
2.3 TypedDict / dataclasses (normalize data)
Normalize untyped payloads immediately.
from typing import TypedDict
class Item(TypedDict):
id: str
value: int
def to_item(x: object) -> Item:
    if not isinstance(x, dict):
        raise TypeError("bad item")
    i, v = x.get("id"), x.get("value")
    if not isinstance(i, str) or not isinstance(v, int):
        raise TypeError("bad fields")
    return {"id": i, "value": v}
2.4 Protocols (structural typing)
Constrain usage via Protocol interfaces.
from typing import Protocol
class Saver(Protocol):
def save(self, path: str) -> None: ...
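Any object with a matching save method satisfies the protocol structurally, with no inheritance required. A short sketch building on Saver (the Report class and function below are hypothetical):

from collections.abc import Sequence

def persist_all(items: Sequence[Saver], path: str) -> None:
    for item in items:
        item.save(path)

class Report:
    def save(self, path: str) -> None:  # structurally satisfies Saver
        print(f"writing report to {path}")

persist_all([Report()], "out/report.txt")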
2.5 Provide type stubs for the library
Create .pyi stubs to replace Any-heavy APIs with precise signatures. Place them in a local typings/ directory (or package) discoverable by the type checker.
thirdparty/__init__.pyi
thirdparty/client.pyi
# thirdparty/client.pyi
from typing import TypedDict
class Item(TypedDict):
id: str
value: int
class Client:
def get_item(self, key: str) -> Item: ...
def list_items(self, limit: int) -> list[Item]: ...
2.6 Typed wrapper (facade) around untyped libs
Expose only typed methods; validate at the boundary.
class ClientFacade:
def __init__(self, raw: object) -> None:
self._raw = raw
def get_item(self, key: str) -> Item:
data = self._raw.get_item(key) # untyped
return to_item(data)
3) Modern 3.12+ Typing Rules
- Use X | None instead of Optional[X].
- Use built-in collections: list[int], dict[str, str], set[str], tuple[int, ...].
- Prefer Literal, TypedDict, Protocol, TypeAlias, Self, TypeVar, ParamSpec when appropriate.
- Use match only when it improves readability and does not increase complexity beyond 14.
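A small before/after illustration of these rules:

# Old style (avoid):
#   from typing import Dict, List, Optional
#   def lookup(index: Dict[str, List[int]], key: str) -> Optional[List[int]]: ...

# 3.12+ style:
def lookup(index: dict[str, list[int]], key: str) -> list[int] | None:
    return index.get(key)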
4) Pydantic v2+ Only
- Use BaseModel (v2), model_validate, and model_dump.
- Validation occurs at external boundaries (I/O, network, third-party libs).
- Do not mix Pydantic with ad-hoc untyped dict usage internally; normalize once.
from pydantic import BaseModel
class ItemModel(BaseModel):
id: str
value: int
def to_item_model(x: object) -> ItemModel:
return ItemModel.model_validate(x)
5) Packaging & Exports
- Public imports should target facades or package __init__.py exports.
- Keep import sites small and stable by consolidating exports.
# pkg/facade.py
from .service import Service
from .models import ItemModel
__all__ = ["Service", "ItemModel"]
# pkg/__init__.py
from .facade import Service, ItemModel
__all__ = ["Service", "ItemModel"]
6) Complexity & Structure
- Refactor long functions into helpers.
- Replace branching with strategy maps when possible.
- Keep functions single-purpose; avoid deep nesting.
- Document non-obvious invariants with brief docstrings or type comments (not ignores).
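For example, a handler map can replace an if/elif chain (handler names here are hypothetical):

import json
from collections.abc import Callable

def _as_json(data: dict[str, object]) -> str:
    return json.dumps(data)

def _as_text(data: dict[str, object]) -> str:
    return "\n".join(f"{k}: {v}" for k, v in data.items())

_FORMATTERS: dict[str, Callable[[dict[str, object]], str]] = {
    "json": _as_json,
    "text": _as_text,
}

def render(data: dict[str, object], fmt: str) -> str:
    try:
        return _FORMATTERS[fmt](data)
    except KeyError as exc:
        raise ValueError(f"unknown format: {fmt}") from exc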
7) Testing Standards (pytest)
Use pytest.
Fixtures live in local conftest.py and must declare an appropriate scope: session, module, or function.
Prefer parameterization and marks to increase coverage without duplication.
# tests/test_items.py
import pytest
@pytest.mark.parametrize("raw,ok", [({"id":"a","value":1}, True), ({"id":1,"value":"x"}, False)])
def test_to_item(raw: dict[str, object], ok: bool) -> None:
if ok:
assert to_item(raw)["id"] == "a"
else:
with pytest.raises(TypeError):
to_item(raw)
Constraints for tests:
- Tests must not import from other tests.
- Tests must not use conditionals or loops inside test bodies that introduce alternate code paths across assertions.
- Prefer multiple parametrized cases over loops/ifs.
- Organize fixtures in conftest.py and mark them with appropriate scopes.
Example fixture:
# tests/conftest.py
import pytest
@pytest.fixture(scope="module")
def fake_client() -> object:
class _Raw:
def get_item(self, key: str) -> dict[str, object]:
return {"id": key, "value": 1}
return _Raw()
8) Integration With Untyped Libraries
- All direct interactions with untyped or Any-returning APIs must be quarantined in adapters/facades.
- The rest of the codebase consumes only typed results.
- Choose the least powerful strategy that satisfies typing (overload → guard → TypedDict/dataclass → Protocol → stubs → facade).
9) Review Checklist (apply before submitting code)
- ✅ No Any, no ignores, no casts.
- ✅ Modern 3.12 typing syntax only.
- ✅ Pydantic v2 used at boundaries.
- ✅ Complexity < 15 for every function.
- ✅ Module size < 750 lines (or split into package).
- ✅ Public imports go through a facade or class.
- ✅ No duplicate logic; helpers or composition extracted.
- ✅ Tests use pytest, fixtures in conftest.py, and parameterization/marks.
- ✅ Tests avoid importing from tests and avoid control flow that reduces clarity; use parametrization instead.
- ✅ Third-party Any is contained via allowed strategies.