Methodology

How PolitiLens measures, scores, and presents political news coverage

PolitiLens is a framing and coverage analysis tool — not a fact-checking service.

Scores are heuristics, outlet positions are editorial estimates, and AI summaries describe framing — not truth. Always read original sources.

Outlet Positions

Estimated editorial position

Each outlet is placed on a 2D political compass with two axes:

x-axis: Economic policy, from interventionist (−1, left) to market-oriented (+1, right)
y-axis: Social values, from libertarian (−1, bottom) to authoritarian (+1, top)

Positions are manually curated editorial estimates informed by established media research (AllSides, Ad Fontes Media, and others). They are not algorithmic outputs or objective measurements. No two researchers will agree perfectly on outlet placement, and outlets shift over time. Treat these as approximate starting points, not ground truth.

Estimated editorial positions — not objective measurements. Manually curated and subject to revision. Currently covering 91 outlets across 6 regions and all major political leans.

Story Clustering

Articles from all outlets are grouped into story clusters — each cluster represents articles covering the same event or topic from different sources.

Grouping uses Jaccard similarity on the tokenized text of each article's title and summary:

similarity = |tokens_A ∩ tokens_B| / |tokens_A ∪ tokens_B|

threshold:  ≥ 0.22  →  same story
minimum:    2 articles per cluster
stop words: ~100 common words excluded
token min:  4 characters
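
The grouping rule above can be sketched in TypeScript as follows. This is a minimal illustration, not the project's actual code: the function names, the short stop-word list, and the ASCII-only tokenizer are assumptions made for the example. Only the 0.22 threshold, the 4-character minimum, and the Jaccard formula come from the description above.

```typescript
// Hypothetical stop-word sample; the real list is described as ~100 words.
const STOP_WORDS = new Set<string>([
  "this", "that", "with", "from", "have", "will", "about", "after",
]);

const MIN_TOKEN_LENGTH = 4;        // "token min: 4 characters"
const SAME_STORY_THRESHOLD = 0.22; // "threshold: >= 0.22 -> same story"

// Lowercase, split on non-alphanumerics, drop short tokens and stop words.
// Assumption: ASCII-only splitting; the real tokenizer may differ.
function tokenize(text: string): Set<string> {
  const tokens = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length >= MIN_TOKEN_LENGTH && !STOP_WORDS.has(t));
  return new Set(tokens);
}

// |A ∩ B| / |A ∪ B|, defined here as 0 when both sets are empty.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared += 1;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Two title+summary strings count as the same story at or above the threshold.
function isSameStory(textA: string, textB: string): boolean {
  return jaccard(tokenize(textA), tokenize(textB)) >= SAME_STORY_THRESHOLD;
}
```

For example, "Senate passes budget bill after marathon session" and "Budget bill clears Senate in late-night vote" share 3 of 10 distinct tokens under this tokenizer, giving a similarity of 0.3, which is above the threshold.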

Clusters are sorted by outlet count × divergence score, then limited to the top 25 for performance. Named entities (people, places, organizations) are extracted from each cluster using the compromise NLP library.

Articles are grouped by keyword similarity. Clustering is approximate and may over- or under-group related stories.

Divergence Score

The divergence score (0–100) measures how widely the political spectrum covers a story. A high score means the story is being covered across different political leans and/or regions — not that any outlet is wrong or misleading.

divergence = (unique_outlets / 5) × 60         ← breadth (max 60)
           + 40  if both left AND right outlets present  ← polarity bonus
           + 10  if ≥ 2 regions covered                 ← regional bonus

ranges:
  0–29   low      → mostly one side covering it
  30–54  moderate → some cross-spectrum coverage
  55–74  high     → significant left/right difference
  75–100 extreme  → polar opposite coverage
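
A minimal TypeScript sketch of the formula and ranges above. The type and function names are hypothetical. Because the three components can sum to 110, the sketch assumes the result is clamped to 100 to stay within the stated 0–100 range; the source does not say how this is handled.

```typescript
type Lean = "left" | "center" | "right";

// Hypothetical shape: one entry per article in a cluster.
interface ClusterArticle {
  outlet: string;
  lean: Lean;
  region: string;
}

function divergenceScore(articles: ClusterArticle[]): number {
  const outlets = new Set(articles.map((a) => a.outlet));
  const regions = new Set(articles.map((a) => a.region));
  const hasLeft = articles.some((a) => a.lean === "left");
  const hasRight = articles.some((a) => a.lean === "right");

  // Same as (unique_outlets / 5) × 60, capped at 60 once 5 outlets are present.
  const breadth = (Math.min(outlets.size, 5) * 60) / 5;
  const polarityBonus = hasLeft && hasRight ? 40 : 0;
  const regionalBonus = regions.size >= 2 ? 10 : 0;

  // Assumption: clamp to 100 so the score stays in the documented range.
  return Math.min(breadth + polarityBonus + regionalBonus, 100);
}

function divergenceLabel(score: number): "low" | "moderate" | "high" | "extreme" {
  if (score >= 75) return "extreme";
  if (score >= 55) return "high";
  if (score >= 30) return "moderate";
  return "low";
}
```

Under these assumptions, a cluster with three outlets that includes both left and right sources from two regions scores 36 + 40 + 10 = 86, which falls in the extreme range.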

Measures how broadly a story is covered across political leans and regions. Not a measure of accuracy or truth.

Sentiment / Tone

Sentiment is computed with the AFINN lexicon (via the sentiment npm package). Each word in an article's title and summary is looked up in a list of ~3,500 English words with pre-assigned scores from −5 (very negative) to +5 (very positive).

Scores are averaged across all articles in a cluster to produce a cluster-level tone. The tone label is determined by the comparative score (total ÷ word count):

comparative > 0.05  → positive
comparative < -0.05 → negative
otherwise          → neutral
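
The scoring and labeling steps can be sketched as below. This is an illustration only: it uses a tiny made-up lexicon instead of the real AFINN list and does not call the sentiment package, so the word scores, the tokenizer, and the function names are assumptions. The ±0.05 thresholds and the total ÷ word count definition are taken from the description above.

```typescript
type Tone = "positive" | "negative" | "neutral";

// Toy lexicon for illustration; these are not the real AFINN entries.
const TOY_LEXICON = new Map<string, number>([
  ["good", 3],
  ["bad", -3],
  ["crisis", -3],
  ["win", 4],
]);

// Comparative score: summed word scores divided by the number of words.
function comparativeScore(text: string): number {
  const words = text
    .toLowerCase()
    .split(/[^a-z']+/)
    .filter((w) => w.length > 0);
  if (words.length === 0) return 0;
  const total = words.reduce((sum, w) => sum + (TOY_LEXICON.get(w) ?? 0), 0);
  return total / words.length;
}

// Thresholds are strict: exactly 0.05 or -0.05 is still neutral.
function toneLabel(comparative: number): Tone {
  if (comparative > 0.05) return "positive";
  if (comparative < -0.05) return "negative";
  return "neutral";
}

// Cluster-level tone: mean of the per-article comparative scores.
function clusterTone(comparatives: number[]): { comparative: number; tone: Tone } {
  const mean =
    comparatives.length === 0
      ? 0
      : comparatives.reduce((sum, c) => sum + c, 0) / comparatives.length;
  return { comparative: mean, tone: toneLabel(mean) };
}
```

Because the comparative score divides by word count, a single strongly scored word in a long summary moves the label far less than the same word in a short headline.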

Reflects language tone (word choice), not factual correctness. Negative tone ≠ false or biased. AFINN does not handle irony, sarcasm, or domain-specific political language well. A "negative" story about war is not the same as a "negative" opinion article.

AI Framing Analysis

Framing analysis — not a factual verdict

When you click "Analyze Framing" on a story, PolitiLens sends up to 6 article titles and summaries to LLaMA 3.3-70B (via OpenRouter's free tier) with the following system prompt:

Analyze how different political outlets are covering this story.
Be specific about framing differences and loaded language.
Do not editorialize — describe factually how each side
is presenting this.

The model returns a structured JSON object (validated with Zod) containing:

▸ One-sentence story summary
▸ How left-leaning, center, and right-leaning outlets frame the story
▸ Loaded or emotionally charged phrases (up to 6)
▸ Facts all outlets agree on (up to 4)
▸ The most interesting framing divergence between left and right

Describes how outlets frame a story. Does not fact-check claims or rank outlets by credibility. Article text is treated as untrusted input. The AI cannot access the full articles — only titles and the first ~300 characters of each summary. Rate limited to 8 analyses per minute per IP.
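
The input limits and the response shape described above can be sketched as follows. This is a dependency-free stand-in: PolitiLens validates the response with Zod, while this example uses hand-written checks so it runs without any packages. All field and function names are hypothetical, and taking the first 6 articles is an assumption; the source does not say how the 6 are selected.

```typescript
interface ArticleInput {
  title: string;
  summary: string;
}

const MAX_ARTICLES = 6;        // "up to 6 article titles and summaries"
const MAX_SUMMARY_CHARS = 300; // "first ~300 characters of each summary"

// Assumption: the first 6 articles are used and summaries are cut at 300 chars.
function prepareFramingInput(articles: ArticleInput[]): ArticleInput[] {
  return articles.slice(0, MAX_ARTICLES).map((a) => ({
    title: a.title,
    summary: a.summary.slice(0, MAX_SUMMARY_CHARS),
  }));
}

// Hypothetical field names for the five items the model returns.
interface FramingAnalysis {
  summary: string;
  framing: { left: string; center: string; right: string };
  loadedPhrases: string[]; // up to 6
  agreedFacts: string[];   // up to 4
  keyDivergence: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isShortStringList(value: unknown, max: number): value is string[] {
  return (
    Array.isArray(value) &&
    value.length <= max &&
    value.every((v) => typeof v === "string")
  );
}

// Returns null for anything that is not valid JSON in the expected shape.
function parseFramingAnalysis(raw: string): FramingAnalysis | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!isRecord(data)) return null;
  const { summary, framing, loadedPhrases, agreedFacts, keyDivergence } = data;
  if (typeof summary !== "string" || typeof keyDivergence !== "string") return null;
  if (!isRecord(framing)) return null;
  const { left, center, right } = framing;
  if (typeof left !== "string" || typeof center !== "string" || typeof right !== "string") {
    return null;
  }
  if (!isShortStringList(loadedPhrases, 6) || !isShortStringList(agreedFacts, 4)) return null;
  return { summary, framing: { left, center, right }, loadedPhrases, agreedFacts, keyDivergence };
}
```

Rejecting malformed output outright, rather than repairing it, fits the statement that article text is treated as untrusted input: a response that does not match the expected shape is never rendered.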

Political Temperature

The temperature gauge shown in the Daily Briefing is a composite of two signals:

temperature = (avg_divergence × 0.6) + (min(sentiment_variance × 50, 100) × 0.4)

labels:
  ≥ 75  → Volatile
  ≥ 55  → Hot
  ≥ 35  → Warm
  < 35  → Cold

Sentiment variance measures how much article tones differ from each other across the current news cycle. High variance means some outlets are very positive while others are very negative — a sign of polarized framing.
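
A minimal TypeScript sketch of the temperature formula and labels. The function names are hypothetical, and two details are assumptions because the source does not specify them: variance is computed as population variance, and it is taken over per-article sentiment scores.

```typescript
// Population variance of a list of sentiment scores (assumption: not sample variance).
function populationVariance(values: number[]): number {
  if (values.length === 0) return 0;
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const squared = values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0);
  return squared / values.length;
}

// temperature = (avg_divergence × 0.6) + (min(sentiment_variance × 50, 100) × 0.4)
function temperature(avgDivergence: number, sentimentVariance: number): number {
  const tonePart = Math.min(sentimentVariance * 50, 100);
  return avgDivergence * 0.6 + tonePart * 0.4;
}

function temperatureLabel(value: number): "Volatile" | "Hot" | "Warm" | "Cold" {
  if (value >= 75) return "Volatile";
  if (value >= 55) return "Hot";
  if (value >= 35) return "Warm";
  return "Cold";
}
```

With an average divergence of 50 and a sentiment variance of 1, the two parts contribute 30 and 20, for a temperature of 50 ("Warm"). The variance term saturates at a variance of 2, so beyond that point only divergence moves the gauge.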

Limitations

▸ Coverage is limited to 63 configured RSS outlets. Many perspectives, languages, and regions are not represented.
▸ Outlet positions are manually configured and may not reflect recent editorial shifts or internal diversity within an outlet.
▸ Story clustering uses keyword overlap and cannot understand semantic meaning. Two articles using different vocabulary for the same story may not be grouped.
▸ AI framing analysis operates only on titles and ~300-character summaries, not full article text. It may miss nuance in the full article.
▸ Sentiment scoring uses a general-purpose English lexicon and does not handle domain-specific language, irony, or code-switching.
▸ GDELT world data (Global Coverage page) is broad but noisy. Country attributions can be unreliable for ambiguous stories.
▸ Fact-check verdicts come from PolitiFact and FactCheck.org, two US-focused organizations. International claims are underrepresented.
▸ Data freshness: news clusters are cached for 5 minutes; briefing for 1 hour; world/congress for 15 minutes.
▸ This tool cannot determine which outlet is 'more accurate'; that requires domain expertise and source verification beyond automated analysis.

Data Sources

63 RSS feeds: Primary news feed from each configured outlet. Fetched with retry/backoff logic and health monitoring.

Currents API: Supplementary global news aggregation. Mapped to outlet lean by domain matching.

Guardian API: Guardian content with richer metadata than the RSS feed.

GDELT: Global Database of Events, Language, and Tone. Powers the world news intensity map. Updated every 15 minutes.

Congress.gov API: Recent House bills and votes. Requires CONGRESS_API_KEY.

PolitiFact: US-focused fact-check verdicts via RSS.

FactCheck.org: Independent US fact-checking organization via RSS.

Federal Register: US government regulatory documents and executive actions.

Pixabay: Free stock images used as story cover images when available.

OpenRouter: LLM API gateway. PolitiLens uses LLaMA 3.3-70B (free tier) for framing analysis and translations.

Last reviewed: May 2026 · Source code: github.com/zhenxiao-yu/politilens