ClearDraft

The evolution of online news headlines


“Unveiling the Transformation of Online News Headlines: A Comparative Analysis”


We analyze multiple datasets that together form a collection of news headlines from various outlets, and we explore features such as headline length, sentiment, and syntactic structure to understand trends in journalism.

In our investigation, we gathered data from various datasets to analyze headlines over time. The BIG4 dataset includes headlines from well-known news outlets like The New York Times and The Guardian from the early 2000s, capturing the transition to online journalism. Additionally, the News on the Web corpus (NOW) offers a broader range of news websites from 2010 onwards. We also incorporated datasets of clickbait headlines and scientific preprint titles for comparison.

The BIG4 corpus covers a wide time range: its New York Times headlines date back to 1851, so the collection spans the shift to online platforms. The ABC Australia dataset showed discrepancies in headline matching; we nevertheless included it in our analyses for completeness.

The NOW corpus, sourced from various English-language news websites, is continuously updated with new articles and so reflects current media discourse. It provides a dynamic view of news over time, and the individual outlets contribute different shares of the dataset. The clickbait-style corpus and the corpus of scientific preprint titles serve as benchmarks.

Using natural language processing techniques, we cleaned the headlines and extracted features, including sentiment and syntactic structure. For the statistical analysis, we fitted linear regressions for continuous features and logistic regressions for binary features to estimate trends over time.
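To make the regression step concrete, here is a minimal sketch of how a trend over time could be estimated for one continuous feature (headline length in words) and one binary feature (whether the headline is a question). Everything in it is an assumption for illustration: the toy headlines, the choice of features, the helper names (`word_count`, `is_question`, `fit_linear`, `fit_logistic`), and the use of publication year as the only predictor. It is not the study's actual pipeline.

```python
import math

# Hypothetical toy records: (publication year, headline text).
HEADLINES = [
    (2005, "Markets fall"),
    (2008, "Banks face new rules"),
    (2012, "Why are house prices rising again?"),
    (2016, "What the new budget means for your savings"),
    (2020, "Is this the end of the office as we know it?"),
    (2023, "Here is how the latest rate rise could change what you pay"),
]


def word_count(headline: str) -> int:
    """Continuous feature: headline length in whitespace-separated words."""
    return len(headline.split())


def is_question(headline: str) -> int:
    """Binary feature: 1 if the headline ends with a question mark, else 0."""
    return int(headline.rstrip().endswith("?"))


def fit_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1) = sigmoid(intercept + slope * x) by gradient ascent
    on the mean log-likelihood."""
    intercept = slope = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_i = grad_s = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(intercept + slope * x)))
            grad_i += y - p
            grad_s += (y - p) * x
        intercept += lr * grad_i / n
        slope += lr * grad_s / n
    return intercept, slope


years = [year for year, _ in HEADLINES]
# Centre the years and express them in decades to keep the optimiser stable.
mean_year = sum(years) / len(years)
decades = [(year - mean_year) / 10 for year in years]

lengths = [word_count(text) for _, text in HEADLINES]
questions = [is_question(text) for _, text in HEADLINES]

lin_intercept, lin_slope = fit_linear(decades, lengths)
log_intercept, log_slope = fit_logistic(decades, questions)

print(f"Length trend: {lin_slope:.2f} words per decade")
print(f"Question trend: {log_slope:.2f} log-odds per decade")
```

A positive slope in either model indicates that the feature becomes more common or larger over time in this toy sample. A real analysis would use a statistics library that also reports standard errors and significance.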

We grouped outlets by political leaning and journalistic quality using established media bias charts. The AllSides Media Bias Chart classifies outlets as left-leaning, center, or right-leaning, while the Ad Fontes Media Bias Chart rates journalistic quality on a green-yellow-red scale.

Our findings revealed a significant increase in headline length over time, prompting further exploration of its correlation with other linguistic features. We acknowledge the complex causal relationships within the data-generating process and focus on analyzing descriptive trends without implying causality.
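As an illustration of how the relationship between headline length and another linguistic feature might be explored, the sketch below computes a Pearson correlation coefficient. The numbers are hypothetical, the second feature (pronoun count per headline) is chosen only for the example, and the helper name `pearson` is our own. A correlation of this kind remains descriptive and says nothing about causality.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical per-headline values, for illustration only.
word_counts = [6, 8, 9, 11, 12, 14]
pronoun_counts = [0, 0, 1, 1, 2, 2]

r = pearson(word_counts, pronoun_counts)
print(f"Correlation between length and pronoun count: {r:.2f}")
```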

We provide open access to our code and datasets for transparency and reproducibility in our analyses.


Published on: 2025-03-13

