Classes
Community
dataclass
A community of related pages discovered through link structure.
Represents a group of pages that are densely connected to each other and share similar topics or themes. Useful for understanding content organization and identifying topic clusters.
Attributes
| Name | Type | Description |
|---|---|---|
| id | int | Unique community identifier |
| pages | set[Page] | Set of pages belonging to this community |
| size | int | Number of pages in the community |
| density | | Internal connection density (0.0-1.0) |
Methods
size
property
def size(self) -> int
Number of pages in this community.
Returns
int
get_top_pages_by_degree
def get_top_pages_by_degree(self, limit: int = 5) -> list[Page]
Get most connected pages in this community.
Parameters
| Name | Type | Description |
|---|---|---|
| limit | int | |
Returns
list[Page]
CommunityDetectionResults
dataclass
Results from community detection analysis.
Contains discovered communities and quality metrics. Communities represent natural groupings of related pages based on link structure.
Attributes
| Name | Type | Description |
|---|---|---|
| communities | list[Community] | List of detected communities |
| modularity | float | Modularity score (quality metric, -1.0 to 1.0, higher is better) |
| iterations | int | |
| num_communities | | Total number of communities detected |
Methods
get_community_for_page
def get_community_for_page(self, page: Page) -> Community | None
Find which community a page belongs to.
Parameters
| Name | Type | Description |
|---|---|---|
| page | Page | |
Returns
Community | None
get_largest_communities
def get_largest_communities(self, limit: int = 10) -> list[Community]
Get largest communities by page count.
Parameters
| Name | Type | Description |
|---|---|---|
| limit | int | |
Returns
list[Community]
get_communities_above_size
def get_communities_above_size(self, min_size: int) -> list[Community]
Get communities with at least min_size pages.
Parameters
| Name | Type | Description |
|---|---|---|
| min_size | int | |
Returns
list[Community]
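The three lookup helpers above are simple queries over `communities`. The following is a minimal stand-in sketch, not the library's implementation: `MiniCommunity` and `MiniResults` are hypothetical names, pages are plain strings, "largest" is assumed to mean sorting by `size` in descending order, and "at least min_size" is read as an inclusive bound.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MiniCommunity:
    """Hypothetical stand-in for Community; pages are plain strings here."""

    id: int
    pages: set[str] = field(default_factory=set)

    @property
    def size(self) -> int:
        return len(self.pages)


@dataclass
class MiniResults:
    """Hypothetical stand-in for CommunityDetectionResults."""

    communities: list[MiniCommunity]

    def get_community_for_page(self, page: str) -> MiniCommunity | None:
        # Linear scan; returns None when the page is in no community.
        for community in self.communities:
            if page in community.pages:
                return community
        return None

    def get_largest_communities(self, limit: int = 10) -> list[MiniCommunity]:
        # Assumed ordering: largest first, truncated to `limit`.
        ranked = sorted(self.communities, key=lambda c: c.size, reverse=True)
        return ranked[:limit]

    def get_communities_above_size(self, min_size: int) -> list[MiniCommunity]:
        # "At least min_size" is treated as inclusive (>=).
        return [c for c in self.communities if c.size >= min_size]


results = MiniResults(
    [
        MiniCommunity(0, {"intro", "install"}),
        MiniCommunity(1, {"api", "cli", "config"}),
        MiniCommunity(2, {"changelog"}),
    ]
)
```

Because `get_community_for_page` can return `None`, callers should check the result before reading attributes such as `size`.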
LouvainCommunityDetector
Detect communities using the Louvain method.
The Louvain algorithm is a greedy method for maximizing the modularity of a network partition. It runs in two phases:
1. Modularity Optimization: Each node is moved to the neighboring community that yields the largest increase in modularity; a node stays where it is if no move improves modularity.
2. Community Aggregation: A new network is built where nodes are communities and edges represent connections between communities.
These phases are repeated until no further improvement is possible.
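The two phases can be sketched on a toy graph. This is an illustrative single pass under stated assumptions, not the detector's actual code: `local_moving` and `aggregate` are hypothetical helper names, nodes are strings, edges are keyed by `frozenset`, the input has no self-loops, and the resolution is applied as a multiplier on the degree term.

```python
from __future__ import annotations

import random
from collections import defaultdict


def local_moving(nodes, edge_weights, resolution=1.0, seed=None):
    """Phase 1: greedily move nodes until no single move raises modularity.

    Assumes no self-loops and at least one edge.
    """
    rng = random.Random(seed)
    neighbors = defaultdict(dict)
    for edge, weight in edge_weights.items():
        u, v = tuple(edge)
        neighbors[u][v] = weight
        neighbors[v][u] = weight
    degree = {n: sum(neighbors[n].values()) for n in nodes}
    m = sum(edge_weights.values())

    community = {n: i for i, n in enumerate(nodes)}  # every node starts alone
    comm_degree = {community[n]: degree[n] for n in nodes}

    order = list(nodes)
    improved = True
    while improved:
        improved = False
        rng.shuffle(order)
        for n in order:
            # Total edge weight from n into each neighboring community.
            links = defaultdict(float)
            for nb, weight in neighbors[n].items():
                links[community[nb]] += weight

            old = community[n]
            comm_degree[old] -= degree[n]  # temporarily take n out

            def gain(c):
                # Gain of inserting n (now isolated) into community c.
                penalty = resolution * comm_degree[c] * degree[n] / (2 * m * m)
                return links.get(c, 0.0) / m - penalty

            best, best_gain = old, gain(old)
            for c in links:
                g = gain(c)
                if g > best_gain + 1e-12:  # strict improvement only
                    best, best_gain = c, g

            comm_degree[best] += degree[n]
            if best != old:
                community[n] = best
                improved = True
    return community


def aggregate(edge_weights, community):
    """Phase 2: collapse each community into one node, summing edge weights.

    Edges inside a community become single-element frozensets (self-loops).
    """
    merged = defaultdict(float)
    for edge, weight in edge_weights.items():
        u, v = tuple(edge)
        merged[frozenset({community[u], community[v]})] += weight
    return dict(merged)


# Two triangles joined by one bridge edge (c-d).
nodes = ["a", "b", "c", "d", "e", "f"]
pairs = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
edges = {frozenset(p): 1.0 for p in pairs}

community = local_moving(nodes, edges, seed=0)
coarse = aggregate(edges, community)
```

On this graph the local-moving pass groups each triangle into its own community, and the aggregated graph has two self-loops and one bridge. A full implementation would feed `coarse` back into phase 1, which requires handling the self-loops that this sketch deliberately leaves out.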
Methods
detect
def detect(self) -> CommunityDetectionResults
Detect communities using Louvain method.
Returns
CommunityDetectionResults
Detection results with the discovered communities.
Internal Methods
__init__
def __init__(self, graph: KnowledgeGraph, resolution: float = 1.0, random_seed: int | None = None)
Initialize Louvain community detector.
Parameters
| Name | Type | Description |
|---|---|---|
| graph | KnowledgeGraph | KnowledgeGraph with page connections |
| resolution | float | Resolution parameter (higher = more communities) |
| random_seed | int \| None | Random seed for reproducibility |
_build_edge_weights
def _build_edge_weights(self, pages: list[Page]) -> dict[frozenset[Page], float]
Build edge weights from the graph.
Uses frozenset to represent undirected edges.
Parameters
| Name | Type | Description |
|---|---|---|
| pages | list[Page] | |
Returns
dict[frozenset[Page], float]
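The `frozenset` key is what makes an edge undirected: `frozenset({a, b})` and `frozenset({b, a})` are equal and hash identically, so links in both directions accumulate under one entry. A small sketch of the idea, assuming each link contributes a weight of 1.0 and self-links are skipped (how the library actually derives weights from the `KnowledgeGraph` is not shown in this reference); `build_edge_weights` and the string page names are illustrative only.

```python
from __future__ import annotations

from collections import defaultdict


def build_edge_weights(links: list[tuple[str, str]]) -> dict[frozenset[str], float]:
    """Collapse directed links into undirected weighted edges."""
    weights: dict[frozenset[str], float] = defaultdict(float)
    for source, target in links:
        if source == target:
            continue  # assumption: self-links are ignored
        # Both directions of a link map to the same frozenset key.
        weights[frozenset((source, target))] += 1.0
    return dict(weights)


links = [("home", "docs"), ("docs", "home"), ("docs", "api")]
edge_weights = build_edge_weights(links)
```

Here the two links between `home` and `docs` merge into a single edge of weight 2.0.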
_compute_node_degrees
def _compute_node_degrees(self, pages: list[Page], edge_weights: dict[frozenset[Page], float]) -> dict[Page, float]
Compute weighted degree for each node.
Parameters
| Name | Type | Description |
|---|---|---|
| pages | list[Page] | |
| edge_weights | dict[frozenset[Page], float] | |
Returns
dict[Page, float]
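A weighted degree is the sum of the weights of all edges touching a node. The sketch below shows one way to compute it from a `frozenset`-keyed edge map; `compute_node_degrees` is a hypothetical free function with string pages, and pages without edges are assumed to get a degree of 0.0.

```python
from __future__ import annotations


def compute_node_degrees(
    pages: list[str], edge_weights: dict[frozenset[str], float]
) -> dict[str, float]:
    """Sum the weights of all edges incident to each page."""
    degrees = {page: 0.0 for page in pages}  # isolated pages stay at 0.0
    for edge, weight in edge_weights.items():
        for page in edge:
            if page in degrees:
                degrees[page] += weight
    return degrees


pages = ["home", "docs", "api", "orphan"]
edge_weights = {
    frozenset({"home", "docs"}): 2.0,
    frozenset({"docs", "api"}): 1.0,
}
degrees = compute_node_degrees(pages, edge_weights)
```

Each edge contributes to both endpoints, so the degrees sum to twice the total edge weight.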
_get_neighboring_communities
def _get_neighboring_communities(self, page: Page, page_to_community: dict[Page, int], edge_weights: dict[frozenset[Page], float]) -> set[int]
Get communities that are neighbors of this page.
Parameters
| Name | Type | Description |
|---|---|---|
| page | Page | |
| page_to_community | dict[Page, int] | |
| edge_weights | dict[frozenset[Page], float] | |
Returns
set[int]
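As a rough illustration, the neighboring communities of a page can be collected by looking up the community of every page it shares an edge with. This sketch scans all edges for simplicity and uses string pages; whether the library scans edges or keeps an adjacency index, and whether it includes the page's own community, is not stated here.

```python
from __future__ import annotations


def get_neighboring_communities(
    page: str,
    page_to_community: dict[str, int],
    edge_weights: dict[frozenset[str], float],
) -> set[int]:
    """Collect the community ids of every page sharing an edge with `page`."""
    found: set[int] = set()
    for edge in edge_weights:
        if page not in edge:
            continue
        for other in edge:
            if other != page:
                found.add(page_to_community[other])
    return found


page_to_community = {"home": 0, "docs": 1, "api": 2, "blog": 2}
edge_weights = {
    frozenset({"home", "docs"}): 1.0,
    frozenset({"docs", "api"}): 1.0,
    frozenset({"api", "blog"}): 1.0,
}
near_docs = get_neighboring_communities("docs", page_to_community, edge_weights)
near_blog = get_neighboring_communities("blog", page_to_community, edge_weights)
```

In this sketch a page's own community appears in the result only when one of its neighbors already belongs to it, as with `blog` and `api`.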
_modularity_gain
def _modularity_gain(self, page: Page, to_community: int, page_to_community: dict[Page, int], edge_weights: dict[frozenset[Page], float], node_degrees: dict[Page, float], total_weight: float) -> float
Calculate modularity gain from moving page to new community.
This uses the fast incremental formula for modularity change.
Parameters
| Name | Type | Description |
|---|---|---|
| page | Page | |
| to_community | int | |
| page_to_community | dict[Page, int] | |
| edge_weights | dict[frozenset[Page], float] | |
| node_degrees | dict[Page, float] | |
| total_weight | float | |
Returns
float
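For orientation, the standard incremental form of the gain from inserting an isolated node $i$ into a community $C$ is shown below. The exact expression used by `_modularity_gain` is not given in this reference, so the mapping to its parameters is an assumption: $k_i$ is taken to be `node_degrees[page]`, $m$ is `total_weight` read as the sum of all edge weights, $k_{i,\mathrm{in}}$ is the total weight of edges from the page into `to_community`, $\Sigma_{\mathrm{tot}}$ is the sum of weighted degrees of the pages already in `to_community`, and $\gamma$ is the resolution.

$$
\Delta Q = \frac{k_{i,\mathrm{in}}}{m} - \gamma \, \frac{\Sigma_{\mathrm{tot}} \, k_i}{2 m^{2}}
$$

The first term rewards edges that become internal to $C$; the second is the expected weight of such edges under random rewiring, which grows with $\gamma$. Because only quantities local to the page and the target community appear, the gain can be evaluated without recomputing modularity over the whole graph.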
_compute_modularity
def _compute_modularity(self, page_to_community: dict[Page, int], edge_weights: dict[frozenset[Page], float], node_degrees: dict[Page, float], total_weight: float) -> float
Compute Newman's modularity Q.
Parameters
| Name | Type | Description |
|---|---|---|
| page_to_community | dict[Page, int] | |
| edge_weights | dict[frozenset[Page], float] | |
| node_degrees | dict[Page, float] | |
| total_weight | float | |
Returns
float
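Newman's modularity can be written per community as the fraction of edge weight inside the community minus the squared fraction of total degree it holds. The function below is a sketch of that formula using the same argument shapes as the signature above; it assumes `total_weight` is the sum of all edge weights (m, not 2m) and adds a hypothetical `resolution` argument, and it is not the library's code.

```python
from __future__ import annotations

from collections import defaultdict


def compute_modularity(
    page_to_community: dict[str, int],
    edge_weights: dict[frozenset[str], float],
    node_degrees: dict[str, float],
    total_weight: float,
    resolution: float = 1.0,  # hypothetical parameter for this sketch
) -> float:
    """Q = sum over communities of internal/m - resolution * (degree_sum / 2m)^2."""
    if total_weight == 0:
        return 0.0
    internal = defaultdict(float)    # edge weight fully inside each community
    degree_sum = defaultdict(float)  # summed weighted degree per community
    for edge, weight in edge_weights.items():
        communities = {page_to_community[p] for p in edge}
        if len(communities) == 1:
            internal[communities.pop()] += weight
    for page, community in page_to_community.items():
        degree_sum[community] += node_degrees[page]
    m = total_weight
    return sum(
        internal[c] / m - resolution * (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by one bridge edge (c-d), one community per triangle.
pairs = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
edge_weights = {frozenset(p): 1.0 for p in pairs}
node_degrees = {"a": 2.0, "b": 2.0, "c": 3.0, "d": 3.0, "e": 2.0, "f": 2.0}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q_split = compute_modularity(split, edge_weights, node_degrees, 7.0)
q_single = compute_modularity(dict.fromkeys(split, 0), edge_weights, node_degrees, 7.0)
```

For this graph the two-triangle partition scores 5/14 (about 0.357), while placing every page in one community scores 0, which is why the split is preferred.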
Functions
detect_communities
def detect_communities(graph: KnowledgeGraph, resolution: float = 1.0, random_seed: int | None = None) -> CommunityDetectionResults
Convenience function to detect communities.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| graph | KnowledgeGraph | required | KnowledgeGraph with page connections |
| resolution | float | 1.0 | Resolution parameter (higher = more communities) |
| random_seed | int \| None | None | Random seed for reproducibility |
Returns
CommunityDetectionResults
Detection results with the discovered communities.
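A typical caller would pass the returned results to a small reporting helper. The sketch below relies only on attributes and methods documented above (`num_communities`, `modularity`, `get_largest_communities`, `id`, `size`). Because this reference does not show the import path, the example runs against a stub built with `SimpleNamespace`; in real use `results` would come from `detect_communities(graph, random_seed=...)`. The `summarize` helper and the stub values are hypothetical.

```python
from __future__ import annotations

from types import SimpleNamespace


def summarize(results, top: int = 3) -> list[str]:
    """Build a short text report from a CommunityDetectionResults-like object."""
    lines = [
        f"{results.num_communities} communities, "
        f"modularity {results.modularity:.2f}"
    ]
    for community in results.get_largest_communities(limit=top):
        lines.append(f"community {community.id}: {community.size} pages")
    return lines


# Stub standing in for real detection output (values are made up).
stub_communities = [SimpleNamespace(id=0, size=4), SimpleNamespace(id=1, size=2)]
stub = SimpleNamespace(
    communities=stub_communities,
    modularity=0.41,
    num_communities=2,
    get_largest_communities=lambda limit=10: stub_communities[:limit],
)
report = summarize(stub)
```

Passing a fixed `random_seed` to `detect_communities` is what makes such a report reproducible between runs.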