Classes
PageRankResults
dataclass
Results from PageRank computation.
Contains importance scores for all pages based on the link stru…
PageRankResults
dataclass Results from PageRank computation.
Contains importance scores for all pages based on the link structure. Pages linked to by many important pages receive high scores.
Attributes
| Name | Type | Description |
|---|---|---|
scores |
dict[Page, float] |
Map of pages to PageRank scores (normalized, sum to 1.0) |
iterations |
int |
Number of iterations until convergence |
converged |
bool |
Whether the algorithm converged within max_iterations |
damping_factor |
float |
Methods 3
get_top_pages
Get top-ranked pages.
get_top_pages
def get_top_pages(self, limit: int = 20) -> list[tuple[Page, float]]
Get top-ranked pages.
Parameters 1
limit |
int |
Number of pages to return |
Returns
List of (page, score) tuples sorted by score descendinglist[tuple[Page, float]]
—
get_pages_above_percentile
Get pages above a certain percentile.
get_pages_above_percentile
def get_pages_above_percentile(self, percentile: int) -> set[Page]
Get pages above a certain percentile.
Parameters 1
percentile |
int |
Percentile threshold (0-100) |
Returns
Set of pages above the thresholdset[Page]
—
get_score
Get PageRank score for a specific page.
get_score
def get_score(self, page: Page) -> float
Get PageRank score for a specific page.
Parameters 1
page |
Page |
Returns
float
PageRankCalculator
Compute PageRank scores for pages in a site graph.
PageRank is a link analysis algorithm that assi…
PageRankCalculator
Compute PageRank scores for pages in a site graph.
PageRank is a link analysis algorithm that assigns numerical weights to pages based on their link structure. Pages that are linked to by many important pages receive high scores.
The algorithm uses an iterative approach:
- Initialize all pages with equal probability (1/N)
- Iteratively update scores based on incoming links
- Continue until convergence or max iterations
Methods 2
compute
Compute PageRank scores for all pages.
compute
def compute(self, seed_pages: set[Page] | None = None, personalized: bool = False) -> PageRankResults
Compute PageRank scores for all pages.
Parameters 2
seed_pages |
set[Page] | None |
Optional set of pages for personalized PageRank Random jumps go only to these pages |
personalized |
bool |
If True, use personalized PageRank |
Returns
PageRankResults with scores and metadataPageRankResults
—
compute_personalized
Compute personalized PageRank from seed pages.
Personalized PageRank biases ra…
compute_personalized
def compute_personalized(self, seed_pages: set[Page]) -> PageRankResults
Compute personalized PageRank from seed pages.
Personalized PageRank biases random jumps toward seed pages, useful for finding pages related to a specific topic.
Parameters 1
seed_pages |
set[Page] |
Set of pages to bias toward |
Returns
PageRankResults with personalized scoresPageRankResults
—
Internal Methods 1
__init__
Initialize PageRank calculator.
__init__
def __init__(self, graph: KnowledgeGraph, damping: float = 0.85, max_iterations: int = 100, convergence_threshold: float = 1e-06)
Initialize PageRank calculator.
Parameters 4
graph |
KnowledgeGraph |
KnowledgeGraph with page connections |
damping |
float |
Probability of following links vs random jump (0-1) Default 0.85 means 85% follow links, 15% random jump |
max_iterations |
int |
Maximum iterations before stopping |
convergence_threshold |
float |
Stop when max score change < this value |
Functions
analyze_page_importance
Convenience function to analyze page importance.
analyze_page_importance
def analyze_page_importance(graph: KnowledgeGraph, damping: float = 0.85, top_n: int = 20) -> list[tuple[Page, float]]
Convenience function to analyze page importance.
Parameters 3
| Name | Type | Default | Description |
|---|---|---|---|
graph |
KnowledgeGraph |
— | KnowledgeGraph with page connections |
damping |
float |
0.85 |
Damping factor (default 0.85) |
top_n |
int |
20 |
Number of top pages to return |
Returns
List of (page, score) tuples for top N pageslist[tuple[Page, float]]
—