Module

analysis.graph_analysis

Graph analysis module for Bengal SSG.

Provides structural analysis of knowledge graphs including connectivity scoring, hub/leaf detection, and page layering.

Classes

GraphAnalyzer

Analyzes knowledge graph structure for page connectivity insights.

Provides methods for:

  • Connectivity scoring (incoming + outgoing refs)
  • Hub detection (highly connected pages)
  • Leaf detection (low-connectivity pages)
  • Orphan detection (no connections)
  • Layer partitioning (for hub-first streaming builds)

Methods 6

get_connectivity
def get_connectivity(self, page: Page) -> PageConnectivity

Get connectivity information for a specific page.

Parameters 1
page Page

Page to analyze

Returns

PageConnectivity

PageConnectivity with detailed metrics

get_connectivity_score
def get_connectivity_score(self, page: Page) -> int

Get total connectivity score for a page.

Connectivity = incoming_refs + outgoing_refs

Parameters 1
page Page

Page to analyze

Returns

int

Connectivity score (higher = more connected)
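
The formula above can be illustrated with a small, self-contained sketch. It does not use Bengal's classes: the `outgoing_refs` mapping and the `connectivity_score` helper are hypothetical stand-ins, and pages are represented by plain slug strings.

```python
from collections import Counter

# Hypothetical link data (not Bengal's data model): slug -> slugs it links to.
outgoing_refs = {
    "index": ["guide", "api", "changelog"],
    "guide": ["api"],
    "api": [],
    "changelog": [],
}

# Count how often each page appears as the target of a link.
incoming_counts = Counter(
    target for targets in outgoing_refs.values() for target in targets
)


def connectivity_score(slug: str) -> int:
    """Connectivity = incoming_refs + outgoing_refs."""
    return incoming_counts[slug] + len(outgoing_refs[slug])


scores = {slug: connectivity_score(slug) for slug in outgoing_refs}
print(scores)
```

Here `index` has no incoming links but three outgoing ones, so it scores 3, while `changelog` is linked once and links nowhere, so it scores 1.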

get_hubs
def get_hubs(self, threshold: int | None = None) -> list[Page]

Get hub pages (highly connected pages).

Hubs are pages with many incoming references. These are typically:

  • Index pages
  • Popular articles
  • Core documentation

Parameters 1
threshold int | None

Minimum number of incoming references for a page to count as a hub (defaults to graph.hub_threshold)

Returns

list[Page]

List of hub pages sorted by incoming references (descending)
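
As a rough sketch of the selection described above, hub detection amounts to a filter on incoming references followed by a descending sort. The `find_hubs` helper, the sample counts, and the default of 5 are all hypothetical, and treating the threshold as inclusive is an assumption based on the word "minimum".

```python
# Hypothetical incoming-reference counts keyed by page slug.
incoming_refs = {"index": 12, "guide": 7, "faq": 3, "changelog": 0}


def find_hubs(incoming: dict[str, int], threshold: int = 5) -> list[str]:
    """Return slugs with at least `threshold` incoming refs, most-linked first."""
    hubs = [slug for slug, count in incoming.items() if count >= threshold]
    return sorted(hubs, key=lambda slug: incoming[slug], reverse=True)


print(find_hubs(incoming_refs))
print(find_hubs(incoming_refs, threshold=3))
```

Lowering the threshold widens the hub set: with a threshold of 3, `faq` joins `index` and `guide`.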

get_leaves
def get_leaves(self, threshold: int | None = None) -> list[Page]

Get leaf pages (low-connectivity pages).

Leaves are pages with few connections. These are typically:

  • One-off blog posts
  • Changelog entries
  • Niche content

Parameters 1
threshold int | None

Maximum connectivity score for a page to count as a leaf (defaults to graph.leaf_threshold)

Returns

list[Page]

List of leaf pages sorted by connectivity (ascending)
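
Leaf detection mirrors hub detection with the comparison reversed. The sketch below is illustrative only: `find_leaves`, the sample scores, and the default of 2 are hypothetical, and the inclusive comparison is an assumption based on the word "maximum".

```python
# Hypothetical connectivity scores (incoming + outgoing refs) keyed by slug.
connectivity = {
    "index": 15,
    "guide": 9,
    "release-notes": 2,
    "one-off-post": 1,
}


def find_leaves(scores: dict[str, int], threshold: int = 2) -> list[str]:
    """Return slugs whose connectivity is at most `threshold`, least-connected first."""
    leaves = [slug for slug, score in scores.items() if score <= threshold]
    return sorted(leaves, key=lambda slug: scores[slug])


print(find_leaves(connectivity))
```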

get_orphans
def get_orphans(self) -> list[Page]

Get orphaned pages (no connections at all).

Orphans are pages with no incoming or outgoing references. These might be:

  • Forgotten content
  • Draft pages
  • Pages that should be linked from navigation

Returns

list[Page]

List of orphaned pages sorted by slug
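
The orphan definition above (no incoming and no outgoing references) can be sketched with plain data structures. The `outgoing_refs` mapping is a hypothetical stand-in for the knowledge graph, with pages represented by their slugs.

```python
# Hypothetical link data: slug -> slugs it links to.
outgoing_refs = {
    "index": ["guide"],
    "guide": [],
    "old-draft": [],
    "about": [],
}

# Every slug that is the target of at least one link.
linked_to = {target for targets in outgoing_refs.values() for target in targets}

# An orphan neither links out nor is linked to; sort by slug for stable output.
orphans = sorted(
    slug
    for slug, targets in outgoing_refs.items()
    if not targets and slug not in linked_to
)
print(orphans)
```

Note that `guide` has no outgoing links but is not an orphan, because `index` links to it.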

get_layers
def get_layers(self) -> PageLayers

Partition pages into three layers by connectivity.

Layers enable hub-first streaming builds:

  • Layer 0 (Hubs): High connectivity, process first, keep in memory
  • Layer 1 (Mid-tier): Medium connectivity, batch processing
  • Layer 2 (Leaves): Low connectivity, stream and release

Returns

PageLayers

PageLayers dataclass with hubs, mid_tier, and leaves attributes (supports tuple unpacking for backward compatibility)
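
One way a dataclass can offer both attribute access and tuple unpacking is to define `__iter__`. The sketch below shows that pattern under stated assumptions: `LayersSketch` and `partition` are hypothetical, the real `PageLayers` may be implemented differently, and using a single score with two cut-offs is a simplification of however Bengal actually assigns layers.

```python
from dataclasses import dataclass, field


@dataclass
class LayersSketch:
    """Hypothetical stand-in for PageLayers, holding slugs instead of pages."""

    hubs: list[str] = field(default_factory=list)
    mid_tier: list[str] = field(default_factory=list)
    leaves: list[str] = field(default_factory=list)

    def __iter__(self):
        # Yielding the fields in order allows `hubs, mid_tier, leaves = layers`.
        return iter((self.hubs, self.mid_tier, self.leaves))


def partition(scores: dict[str, int], hub_min: int, leaf_max: int) -> LayersSketch:
    """Assign each slug to a layer using two assumed score cut-offs."""
    layers = LayersSketch()
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for slug, score in ranked:
        if score >= hub_min:
            layers.hubs.append(slug)
        elif score <= leaf_max:
            layers.leaves.append(slug)
        else:
            layers.mid_tier.append(slug)
    return layers


scores = {"index": 12, "guide": 6, "faq": 3, "changelog": 1}
layers = partition(scores, hub_min=10, leaf_max=2)

# Attribute access and tuple unpacking both work on the same object.
hubs, mid_tier, leaves = layers
print(layers.hubs, mid_tier, leaves)
```

A hub-first build could then process `hubs` first and keep them in memory, batch `mid_tier`, and stream `leaves`, as the layer descriptions above suggest.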

Internal Methods 2

__init__
def __init__(self, graph: KnowledgeGraph) -> None

Initialize the graph analyzer.

Parameters 1
graph KnowledgeGraph

Knowledge graph to analyze (must be built)

_ensure_built
def _ensure_built(self) -> None

Verify the graph has been built before analysis.
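
A guard of this kind is usually called at the top of each public method so that analysis fails fast on an unbuilt graph. The following is a minimal sketch of the pattern only: `GraphStub`, its `built` flag, `AnalyzerSketch`, and `GraphNotBuiltError` are hypothetical names, and the attribute and exception that Bengal actually uses are not documented here.

```python
class GraphNotBuiltError(RuntimeError):
    """Hypothetical error raised when analysis runs on an unbuilt graph."""


class GraphStub:
    """Hypothetical stand-in for KnowledgeGraph with a simple built flag."""

    def __init__(self) -> None:
        self.built = False
        self.pages: list[str] = []

    def build(self) -> None:
        self.pages = ["index", "guide"]
        self.built = True


class AnalyzerSketch:
    def __init__(self, graph: GraphStub) -> None:
        self.graph = graph

    def _ensure_built(self) -> None:
        # Fail fast instead of returning misleading empty results.
        if not self.graph.built:
            raise GraphNotBuiltError("Build the graph before analyzing it.")

    def page_count(self) -> int:
        self._ensure_built()
        return len(self.graph.pages)


graph = GraphStub()
analyzer = AnalyzerSketch(graph)

try:
    analyzer.page_count()
    failed_before_build = False
except GraphNotBuiltError:
    failed_before_build = True

graph.build()
count_after_build = analyzer.page_count()
print(failed_before_build, count_after_build)
```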