Graph Analysis

Analyze your site's structure, improve internal linking, and optimize navigation

Bengal's graph analysis tools help you understand your site's structure, identify optimization opportunities, and improve internal linking.

Overview

Bengal uses a semantic link model that understands different types of connections:

| Link Type  | Weight | Description                                   |
|------------|--------|-----------------------------------------------|
| Explicit   | 1.0    | Human-authored markdown links                 |
| Menu       | 10.0   | Navigation menu items (high editorial intent) |
| Taxonomy   | 1.0    | Shared tags/categories                        |
| Related    | 0.75   | Algorithm-computed related posts              |
| Topical    | 0.5    | Section hierarchy (parent → child)            |
| Sequential | 0.25   | Next/prev navigation                          |

This weighted approach provides nuanced connectivity analysis beyond simple orphan detection, eliminating false positives from structural links.
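
As a rough illustration of the weighting above, the sketch below computes a page's score as the weighted sum of its incoming links. The weights are copied from the table; the function name and the input shape (a count of incoming links per link type) are assumptions made for the example, not Bengal's internal API.

```python
# Weights copied from the link-type table above.
LINK_WEIGHTS = {
    "explicit": 1.0,
    "menu": 10.0,
    "taxonomy": 1.0,
    "related": 0.75,
    "topical": 0.5,
    "sequential": 0.25,
}


def connectivity_score(incoming_links: dict[str, int]) -> float:
    """Weighted sum of incoming links, keyed by link type (hypothetical helper)."""
    return sum(LINK_WEIGHTS[kind] * count for kind, count in incoming_links.items())


# A page reached only through its section index and next/prev navigation.
structural_only = {"topical": 1, "sequential": 1}
print(connectivity_score(structural_only))  # 0.75
```

A page like this is not a true orphan, but its score stays below 1.0 until someone adds an explicit link to it.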

Connectivity Levels

Pages are classified by their weighted connectivity score:

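| Level                | Score      | Status          | Action                               |
|----------------------|------------|-----------------|--------------------------------------|
| 🟢 Well-Connected    | ≥ 2.0      | Excellent       | No action needed                     |
| 🟡 Adequately Linked | 1.0 - 2.0  | Good            | Could add more links                 |
| 🟠 Lightly Linked    | 0.25 - 1.0 | Fair            | Should add explicit cross-references |
| 🔴 Isolated          | < 0.25     | Needs attention | Add internal links                   |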
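
The thresholds in this table can be sketched as a small lookup. This is a minimal sketch, not Bengal's code: the function name and level identifiers are hypothetical, and it assumes that a score sitting exactly on a boundary (1.0 or 0.25) belongs to the higher level, which the table does not state explicitly.

```python
# (lower bound, level) pairs, highest first. Bounds come from the table above;
# treating each lower bound as inclusive is an assumption.
LEVELS = [
    (2.0, "well_connected"),
    (1.0, "adequately_linked"),
    (0.25, "lightly_linked"),
]


def connectivity_level(score: float) -> str:
    """Map a weighted connectivity score to a level name (hypothetical helper)."""
    for lower_bound, level in LEVELS:
        if score >= lower_bound:
            return level
    return "isolated"


for score in (2.5, 1.46, 0.75, 0.0):
    print(score, connectivity_level(score))
```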

Quick Start

Get a unified analysis report:

BASH
bengal graph report

Example: Output

PLAINTEXT
================================================================================
📊 Site Analysis Report
================================================================================

📈 Overview
   Total pages:        124
   Total links:        316
   Avg links/page:     2.5
   Avg conn. score:    1.46
   Communities:        0

🔗 Connectivity Distribution
   🟢 Well-Connected:      39 pages (31.5%)
   🟡 Adequately:          38 pages (30.6%)
   🟠 Lightly Linked:      26 pages (21.0%)
   🔴 Isolated:            21 pages (16.9%) ⚠️

🔴 Isolated Pages (need attention)
      • content/_index.md
      • content/docs/_index.md
      ... and 19 more

💡 Recommendations
   • Add explicit cross-references to isolated pages
   • Add internal links to lightly-linked pages
================================================================================

For CI pipelines:

BASH
bengal graph report --brief --ci --threshold-isolated 25

Example: CI Output

PLAINTEXT
📊 Site Analysis: 124 pages
   Isolated: 21 (16.9%) ⚠️
   Lightly linked: 26 (21.0%)
   Avg score: 1.46 (good)
✅ CI PASSED: 21 isolated pages within threshold (25)

Understanding the Analysis

Bengal tracks different types of links with semantic meaning:

| Link Type  | Weight | Description                                     |
|------------|--------|-------------------------------------------------|
| Explicit   | 1.0    | Human-authored [text](url) markdown links       |
| Menu       | 10.0   | Navigation menu items (deliberate prominence)   |
| Taxonomy   | 1.0    | Shared tags or categories                       |
| Related    | 0.75   | Algorithm-computed related posts                |
| Topical    | 0.5    | Section hierarchy (parent _index.md → children) |
| Sequential | 0.25   | Next/prev navigation within section             |

Key Metrics

  • Connectivity score: Weighted sum of all incoming links
  • Average score: Mean connectivity score across all pages (aim for ≥1.0)
  • Isolated pages: Pages with score < 0.25 (need attention)
  • Distribution: Percentage of pages at each connectivity level
  • Hub pages: Pages with many incoming links (important for navigation)
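
To make the average score and distribution metrics concrete, here is a small sketch that derives both from a list of per-page scores. The function and key names are hypothetical, and the level boundaries are assumed to be inclusive at the lower end.

```python
from bisect import bisect_right

# Level boundaries from this page; a score equal to a bound is assumed to
# fall into the higher level.
BOUNDS = [0.25, 1.0, 2.0]
NAMES = ["isolated", "lightly_linked", "adequately_linked", "well_connected"]


def summarize(scores: list[float]) -> dict:
    """Return the mean score and a (count, percent) pair per level."""
    if not scores:
        raise ValueError("need at least one page score")
    counts = dict.fromkeys(NAMES, 0)
    for score in scores:
        counts[NAMES[bisect_right(BOUNDS, score)]] += 1
    total = len(scores)
    return {
        "average_score": sum(scores) / total,
        "distribution": {
            name: (count, round(100 * count / total, 1))
            for name, count in counts.items()
        },
    }


# Synthetic scores chosen to match the page counts in the sample report.
scores = [2.0] * 39 + [1.0] * 38 + [0.5] * 26 + [0.0] * 21
print(summarize(scores)["distribution"])
```

With these synthetic scores the percentages match the sample report (31.5%, 30.6%, 21.0%, 16.9%). The average does not, because the real per-page scores are not shown.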

Command Reference

Unified Analysis Report

Get a comprehensive connectivity report:

BASH
bengal graph report

Options:

  • --brief: Compact output for CI pipelines
  • --ci: CI mode with exit codes
  • --threshold-isolated N: Max isolated pages before failure (default: 5)
  • --threshold-lightly N: Max lightly-linked before warning (default: 20)
  • --format FORMAT: Output format - console, json, or markdown

List Under-Linked Pages

Find pages by connectivity level:

BASH
# Show isolated pages (default)
bengal graph orphans

# Show lightly-linked pages
bengal graph orphans --level lightly

# Show both isolated and lightly-linked
bengal graph orphans --level all

Example: Output

PLAINTEXT
📊 Connectivity Distribution
==========================================================================================
  🟢 Well-Connected (≥2.0):      39 pages (31.5%)
  🟡 Adequately Linked (1-2):    38 pages (30.6%)
  🟠 Lightly Linked (0.25-1):    26 pages (21.0%)
  🔴 Isolated (<0.25):           21 pages (16.9%)
==========================================================================================

🟠 Lightly Linked Pages (26 total)
==========================================================================================
#    Score    Path                                          Title
------------------------------------------------------------------------------------------
1    0.50     content/authors/lbliii.md                     Lawrence Lane
2    0.50     content/docs/build-sites/structure/analysis/graph.md        Graph Analysis
3    0.75     content/docs/about/glossary.md                Glossary
4    0.75     content/docs/reference/cheatsheet.md          Cheatsheet
...
==========================================================================================

JSON output (with --format json):

JSON
{
  "level_filter": "lightly",
  "distribution": {"isolated": 21, "lightly_linked": 26, ...},
  "pages": [
    {
      "path": "content/docs/about/glossary.md",
      "title": "Glossary",
      "score": 0.75,
      "metrics": {
        "explicit": 0,
        "menu": 0,
        "taxonomy": 0,
        "topical": 1,
        "sequential": 1
      }
    }
  ]
}

Options:

  • --level LEVEL: Filter by level - isolated, lightly, adequately, all
  • --format FORMAT: Output format - table, json, paths
  • --sort FIELD: Sort by path, title, or score
  • --limit N: Limit results
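
The JSON output shown above can be post-processed with a few lines of Python. The sketch below lists pages that have structural links but no explicit ones. It assumes only the fields visible in the sample (`pages`, `path`, `metrics`, `explicit`); the embedded sample is trimmed to the keys shown on this page, and the helper name is hypothetical.

```python
import json

# Trimmed copy of the sample output; real output may contain more fields.
SAMPLE = """
{
  "level_filter": "lightly",
  "distribution": {"isolated": 21, "lightly_linked": 26},
  "pages": [
    {
      "path": "content/docs/about/glossary.md",
      "title": "Glossary",
      "score": 0.75,
      "metrics": {"explicit": 0, "menu": 0, "taxonomy": 0,
                  "topical": 1, "sequential": 1}
    }
  ]
}
"""


def pages_without_explicit_links(report: dict) -> list[str]:
    """Paths of pages whose metrics show zero explicit incoming links."""
    return [
        page["path"]
        for page in report["pages"]
        if page["metrics"].get("explicit", 0) == 0
    ]


report = json.loads(SAMPLE)
print(pages_without_explicit_links(report))
```

In practice the input would come from a file written by `bengal graph orphans --format json` rather than an inline string.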

Analyze Site Structure

Get an overview of your site's connectivity:

BASH
bengal graph analyze

Options:

  • --tree: Show site structure as a tree visualization
  • --output path/to/graph.html: Generate interactive HTML visualization
  • --config PATH: Path to config file (default: bengal.toml)

Short aliases:

BASH
bengal g report        # g → graph
bengal analyze         # Top-level alias

PageRank Analysis

Identify your most important pages using Google's PageRank algorithm:

BASH
bengal graph pagerank --top-n 10

Example: Output

PLAINTEXT
🏆 Top 10 Pages by PageRank
====================================================================================================
Analyzed 124 pages • Converged in 55 iterations • Damping: 0.85
====================================================================================================
Rank   Title                                         Score        In    Out
----------------------------------------------------------------------------------------------------
1      Template Functions Reference                  0.04669      7.5   2
2      Templating                                    0.03515      6     1
3      Analysis System                               0.02980      2.0   2
4      Health Check System                           0.02592      3.0   2
5      Theme Variables Reference                     0.02559      5.5   2
====================================================================================================

📊 Insights
• Average PageRank score:     0.007300
• Top 10% threshold:          12 pages (score ≥ 0.016534)
• Score concentration:        Moderate

Options:

  • --top-n N: Show top N pages (default: 20)
  • --damping FLOAT: PageRank damping factor (default: 0.85)
  • --format FORMAT: Output format - table, json, csv, or summary
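
For readers unfamiliar with the damping factor, the following is a textbook power-iteration sketch of PageRank on an unweighted link graph. It is not Bengal's implementation: the sample output above shows fractional incoming-link values, which suggests the real analysis uses weighted links, and the graph, function name, and tolerance here are all illustrative assumptions.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             tol: float = 1e-9, max_iter: int = 200) -> dict[str, float]:
    """Generic PageRank by power iteration over {source: [targets]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new = {page: (1.0 - damping) / n + damping * dangling / n for page in pages}
        for source, targets in links.items():
            for target in targets:
                # Each source splits its rank evenly among its outgoing links.
                new[target] += damping * rank[source] / len(targets)
        converged = sum(abs(new[page] - rank[page]) for page in pages) < tol
        rank = new
        if converged:
            break
    return rank


# Hypothetical four-page site.
site = {"home": ["docs", "blog"], "docs": ["home"], "blog": ["docs"], "about": ["docs"]}
for page, score in sorted(pagerank(site).items(), key=lambda item: -item[1]):
    print(f"{page:6} {score:.4f}")
```

A lower damping value spreads rank more evenly across pages; a higher one concentrates it on pages that receive many links.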

Community Detection

Discover topical clusters in your content:

BASH
bengal graph communities --top-n 10 --min-size 3

Options:

  • --min-size N: Minimum community size to show (default: 2)
  • --resolution FLOAT: Resolution parameter - higher = more communities (default: 1.0)
  • --top-n N: Number of communities to show (default: 10)

Bridge Pages Analysis

Find critical navigation pages:

BASH
bengal graph bridges --top-n 10

Example: Output

PLAINTEXT
====================================================================================================
🌉 Navigation Path Analysis
====================================================================================================
Analyzed 124 pages • Avg path: 7.05 • Diameter: 19
====================================================================================================

🔗 Top 10 Bridge Pages (Betweenness Centrality)
----------------------------------------------------------------------------------------------------
Rank   Title                                              Betweenness    In    Out
----------------------------------------------------------------------------------------------------
1      Icon Reference                                     0.1419         6.0   5
2      Navigation Directives                              0.1278         3.75  3
3      Object Model                                       0.1142         0.75  2
4      Build Cache                                        0.1117         2.0   2
5      Site Templates Reference                           0.1037         1.0   5

🎯 Top 10 Most Accessible Pages (Closeness Centrality)
----------------------------------------------------------------------------------------------------
Rank   Title                                              Closeness      Out
----------------------------------------------------------------------------------------------------
1      Authors                                            1.0000         1
2      Content Organization                               1.0000         2
3      Template Functions Reference                       1.0000         2
====================================================================================================

📊 Insights
• Average path length:        7.05 hops
• Network diameter:           19 hops
• Structure:                  Deep (consider shortening paths)

Metrics:

  • Betweenness: Pages that connect different parts of the site (bridge pages)
  • Closeness: Pages easy to reach from anywhere (accessible pages)
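
The average path length and diameter in the report header can be understood through a breadth-first search over the link graph. The sketch below is a generic illustration, not Bengal's code: it assumes directed, unweighted links and counts only pairs of pages where one can actually reach the other. Betweenness is not covered here.

```python
from collections import deque


def hop_counts(links: dict[str, list[str]], start: str) -> dict[str, int]:
    """Fewest link hops from `start` to every page reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def path_stats(links: dict[str, list[str]]) -> tuple[float, int]:
    """Return (average path length, diameter) over reachable page pairs."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    lengths = [
        hops
        for page in pages
        for target, hops in hop_counts(links, page).items()
        if target != page
    ]
    if not lengths:
        return 0.0, 0
    return sum(lengths) / len(lengths), max(lengths)


# Hypothetical chain of pages linked only by "next" navigation.
chain = {"a": ["b"], "b": ["c"], "c": ["d"]}
print(path_stats(chain))
```

A long chain like this produces a large diameter relative to the page count, which is the kind of structure the report labels as deep.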

Link Suggestions

Get smart recommendations for internal linking:

BASH
bengal graph suggest --top-n 50 --min-score 0.5

Options:

  • --top-n N: Number of suggestions to show (default: 50)
  • --min-score FLOAT: Minimum score threshold (default: 0.3)
  • --format FORMAT: Output format - table, json, or markdown

Common Workflows

Improving Internal Linking

  1. Get connectivity report:

    BASH
       bengal graph report
    
  2. Find isolated pages (highest priority):

    BASH
       bengal graph orphans --level isolated
    
  3. Find lightly-linked pages (could improve):

    BASH
       bengal graph orphans --level lightly --format json > lightly-linked.json
    
  4. Get link suggestions:

    BASH
       bengal graph suggest --min-score 0.5 --format markdown > suggestions.md
    
  5. Prioritize by importance:

    BASH
       bengal graph pagerank --top-n 30 --format csv > important-pages.csv
    

CI Integration

Add connectivity checks to your CI pipeline:

BASH
# Fail build if too many isolated pages
bengal graph report --ci --threshold-isolated 5

# Export JSON for artifact storage
bengal graph report --format json > connectivity-report.json
bengal graph orphans --format json > under-linked-pages.json

Optimizing Navigation

  1. Find bridge pages:

    BASH
       bengal graph bridges --metric betweenness --top-n 10
    
  2. Ensure bridge pages are prominent — Add to main navigation menus.

  3. Check accessibility:

    BASH
       bengal graph bridges --metric closeness --top-n 10
    

Understanding Connectivity Levels

The report command classifies pages by weighted connectivity:

| Level                | Indicator      | Meaning              | Action                        |
|----------------------|----------------|----------------------|-------------------------------|
| 🔴 Isolated          | Score < 0.25   | No meaningful links  | Add explicit cross-references |
| 🟠 Lightly Linked    | Score 0.25-1.0 | Only structural links | Add internal links           |
| 🟡 Adequately Linked | Score 1.0-2.0  | Some connections     | Could add more                |
| 🟢 Well-Connected    | Score ≥ 2.0    | Multiple link types  | No action needed              |

Why Use Weighted Scores?

Binary orphan detection causes false positives:

  • Pages in section hierarchies have topical links from parent _index.md
  • Pages with next/prev navigation have sequential links
  • These count as connections but carry low editorial intent

Weighted scores give credit for structural links while highlighting pages that need explicit human-authored links.

Additional Recommendations

| Recommendation       | Meaning                        | Action                           |
|----------------------|--------------------------------|----------------------------------|
| 🌉 Bridge Pages      | Pages connecting site sections | Make prominent in menus          |
| 🏆 Hub Pages         | Pages with many incoming links | Keep updated and maintained      |
| 📊 Low Average Score | Overall connectivity < 1.0     | Add more internal links site-wide |

Best Practices

Regular Analysis

Run analysis regularly to track improvements:

BASH
# Save connectivity report
bengal graph report --format json > analysis-$(date +%Y%m%d).json

# Track isolated pages over time
bengal graph orphans --format json > isolated-$(date +%Y%m%d).json

# PageRank for importance
bengal graph pagerank --format csv > pagerank-$(date +%Y%m%d).csv
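
Once dated snapshots exist, comparing them takes only a few lines. This is a minimal sketch, assuming the snapshot files contain the `distribution.isolated` field shown in the `orphans` JSON sample on this page; the function names and the file names in the demo are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def isolated_count(snapshot: Path) -> int:
    """Read the isolated-page count from one saved orphans JSON snapshot."""
    data = json.loads(snapshot.read_text(encoding="utf-8"))
    return data["distribution"]["isolated"]


def isolated_change(older: Path, newer: Path) -> int:
    """Negative means fewer isolated pages in the newer snapshot."""
    return isolated_count(newer) - isolated_count(older)


if __name__ == "__main__":
    # Demo with two throwaway snapshots standing in for real dated files.
    with tempfile.TemporaryDirectory() as tmp:
        older = Path(tmp) / "isolated-20250101.json"
        newer = Path(tmp) / "isolated-20250201.json"
        older.write_text(json.dumps({"distribution": {"isolated": 21}}), encoding="utf-8")
        newer.write_text(json.dumps({"distribution": {"isolated": 18}}), encoding="utf-8")
        print(isolated_change(older, newer))  # -3
```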

Focus on High-Impact Changes

  1. Fix isolated pages first (🔴) — Highest priority
  2. Improve lightly-linked pages (🟠) — Add explicit cross-references
  3. Link to high PageRank pages — Maximum SEO benefit
  4. Improve bridge pages — Better navigation
  5. Add internal links gradually — Don't over-link

CI/CD Integration

Add connectivity gates to your pipeline:

YAML
# GitHub Actions example
- name: Check connectivity
  run: bengal graph report --ci --threshold-isolated 5
