Amazon is reportedly in talks to invest $50 billion in OpenAI

If a deal materializes, it would mean Amazon is backing competing startups in the race for AI supremacy.

Medium gives employees Friday off to participate in national strike protesting ICE

Activists behind the general strike are calling for "no work, no school, and no shopping" amid a push to defund ICE, which has escalated raids in U.S. cities, killing several people, including two U.S. citizens earlier this month in Minneapolis.

Electronic Frontier Foundation calls for stronger privacy with Encrypt It Already campaign

Electronic Frontier Foundation (EFF) has kicked off a new campaign calling on technology companies to offer better privacy protections to users. The Encrypt It Already campaign is practically begging for end-to-end encryption to be implemented across data and communication systems by default. EFF points to the likes of WhatsApp and Signal – both of which already offer end-to-end encryption – as good examples of ensuring that only the sender and recipient are able to access messages. Recognizing that the approach taken will be different depending on the product and the circumstances, EFF has a number of suggestions. With WhatsApp and Signal…

Scaling enterprise AI: lessons in governance and operating models from IBM

Successful implementation and scaling of enterprise AI projects is fundamentally a people and operating model challenge, not just a technology problem.

2025 Transparency Report Overview


When we set out to build Bluesky, we started from a simple premise: social media is broken at its foundation. For too long, platforms have treated people as products—optimizing for engagement over wellbeing, prioritizing profit over safety, and concentrating power in the hands of a few companies and individuals rather than sharing it with users and communities. We are now paying for these decisions as the attention economy pervades society and rewards performance over substance.

Bluesky's mission is to fix the structural issues with social media by transitioning from platforms to protocols. We believe the future of social networking should be open, give users genuine control over their data and experience, and create accountability through transparency. That means building differently—not just better moderation policies, but a fundamentally different architecture that puts power back in users' hands.

In 2025, as Bluesky grew from 25 million to 41 million users, we improved the trust and safety infrastructure to better enable that mission. Here's what that looks like in plain language:

Our moderation approach is human-centered. We use automated systems to catch the bad stuff at scale—spam networks, child exploitation, and coordinated attacks. For everything requiring context, nuance, or cultural understanding, trained human moderators make the call. When you report harassment, hate speech, or other complex harms, a person reviews it against our Community Guidelines.

We are building transparency into our systems. Our new strike system is continually evolving to inform you which policy you violated, how serious it was, and where you stand. When we make mistakes—which we will—our appeals process involves human review, not just automated rejection.

The most important developments in 2025 were:

We dramatically reduced toxic content through proactive design. By October, our reply filtering system had decreased user reports of anti-social behavior by 79%. This wasn't about removing speech or banning users—it was about using product design to reduce the visibility of toxic replies while keeping them accessible to those who want to see them. Small design choices, thoughtfully implemented, created measurably healthier conversations.

We implemented age assurance across multiple jurisdictions. This was genuinely hard. Nearly every country and US state has different, often conflicting requirements, and some well-intentioned laws also created serious barriers to privacy and free expression. We built custom paths for the UK, Australia, and multiple US states, threading the needle between legal compliance and our values. When Mississippi's law forced impossible choices, we initially blocked access rather than compromise on privacy—then built better systems and restored access for adults later in the year.

We scaled our moderation operations to match our unprecedented growth. In 2025, users created 1.41 billion posts—61% of all posts ever made on Bluesky. We maintained 24/7 moderation coverage, reviewed 9.97 million user reports, applied 16.49 million labels, and removed 2.45 million pieces of violating content. We brought our spam and bot detection capabilities to a level where misleading content reports dropped month over month even as our user base grew. We built a verification system that includes both centralized verification from Bluesky and a network of Trusted Verifiers—news organizations, universities, and cities that have signed up as institutions able to verify their own communities.

This work required substantial investment and difficult choices. We were forced to make hard trade-offs between speed and accuracy, between removing harm and preserving expression. We didn't get everything right. But we strengthened the foundation for trust and safety systems that can scale while staying true to our mission.

Thank you for being part of building something better.

Read the full Bluesky 2025 Transparency Report


Bluesky 2025 Transparency Report


Throughout 2025, our Trust & Safety work was guided by the premise that social media should serve the people using it. That means reducing exposure to toxic content, protecting young people while respecting privacy, and building systems that can respond quickly to emerging harms. Our moderation systems scaled to ably manage large, event-driven volumes, and we invested heavily in regulatory readiness, safety systems, and product improvements that strengthen Bluesky and the entire AT Protocol ecosystem.

Our transparency report provides data and details about those investments, and the progress we made this year.

Overview

Bluesky grew nearly 60% in 2025, from 25.94M to 41.41M users. This includes accounts hosted on Bluesky's infrastructure, as well as the thousands of Personal Data Servers operated by people across the federated AT Protocol network, with the majority hosted independently by third parties. This decentralized architecture remains core to our mission of transitioning the social web from platforms to protocols, giving users genuine control over their data and experience.

The growth in our user base came alongside a dramatic expansion in site activity. Users created 1.41 billion posts during 2025—representing 61% of all posts ever made on Bluesky. Media sharing surged, with 235M posts containing photos, videos, or other media in 2025, accounting for 62% of all media posts in the site's history.

As the largest host of accounts and default port of entry for people joining Bluesky, we maintained 24/7 moderation operations throughout 2025, with specialized teams focused on critical areas like child safety. Moderation at scale requires a hybrid approach: We pair automated systems (like rules engines that detect and combat spam in real time) with comprehensive human oversight, including human review of all appeals and edge cases.

Our work in 2025 concentrated on five key areas: proactive content moderation, age assurance across multiple jurisdictions, enhancing policy guidelines and moderation tools, account verification, and regulatory compliance infrastructure. The following sections detail each initiative along with our operational metrics and additional details about our safety-related legal compliance work.


PART I: 2025 KEY INITIATIVES

Toxicity Filtering

Toxicity is a persistent challenge for all large-scale social apps. As communities grow, maintaining space for both friendly conversation and fierce disagreement requires intentional design choices. Our community doubled in size over the past year, and with that growth came tension: how to preserve healthy discourse while respecting genuine debate and diverse user preferences.

Toxic and inflammatory discourse appears across all forms of social media, and almost universally a small percentage of people contribute disproportionately to the problem. A tiny number of users can have an outsize impact on conversation quality — and on people's willingness to participate. In 2023-2024, anti-social behavior, such as harassment, trolling, and intolerance, consistently ranked among the top complaints reported by users. This content drives people away from forming connections, posting, or engaging, for fear of attacks and pile-ons.

In 2024, we introduced features that gave users more control over their threads: the ability to detach quote posts to limit pile-ons, and tools to hide replies on posts you created. Building on that foundation, we began testing whether product design could also work proactively to improve discourse quality across the network.

In April 2025, we started with lists, to address a growing trend of users engaging in targeted abuse via list names, and to see if a small-scale detection integrated into moderation workflows could generate positive results. When a user reports a list, we automatically assess both the title and the description for overt toxicity and apply a !hide label to the list. This treatment makes the list invisible to anyone but the list creator, and if they adjust the list name and description, we remove the label on appeal. Previously, moderators would assess and take down toxic lists, which frustrated users who felt they had lost all of their curation work. The new approach lets us balance reducing harm with giving users the chance to address the toxicity in their content. It significantly cleaned up list quality on Bluesky: people now use lists for their intended curation or moderation purposes, and abusive use cases have been cut down.
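
To make the workflow concrete, here is a minimal sketch of a report-triggered check along these lines. The term matcher and label service interfaces are hypothetical stand-ins; only the !hide label and the report/appeal flow come from the description above.

```typescript
// Illustrative sketch only: a report-triggered check that hides a list whose
// name or description contains overtly toxic terms. The term matcher and
// label service are hypothetical stand-ins, not Bluesky's actual implementation.

interface ListRecord {
  uri: string;        // at:// URI of the list
  name: string;
  description?: string;
}

interface LabelService {
  apply(subjectUri: string, label: string): Promise<void>;
  remove(subjectUri: string, label: string): Promise<void>;
}

// Hypothetical matcher: in practice this would be a curated term list or model.
function containsOvertToxicity(text: string): boolean {
  const blockedTerms = ["<toxic-term-1>", "<toxic-term-2>"]; // placeholder terms
  const lowered = text.toLowerCase();
  return blockedTerms.some((term) => lowered.includes(term));
}

// Called when a user reports a list: hide it from everyone but its creator.
export async function handleListReport(list: ListRecord, labels: LabelService): Promise<void> {
  const text = `${list.name} ${list.description ?? ""}`;
  if (containsOvertToxicity(text)) {
    await labels.apply(list.uri, "!hide");
  }
}

// Called on appeal: if the creator has cleaned up the name and description, lift the label.
export async function handleListAppeal(list: ListRecord, labels: LabelService): Promise<void> {
  const text = `${list.name} ${list.description ?? ""}`;
  if (!containsOvertToxicity(text)) {
    await labels.remove(list.uri, "!hide");
  }
}
```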

In October, we began experimenting with improving conversation quality, starting with replies. Rather than only reacting after users report abusive or toxic interactions, we launched an experiment to identify replies that are toxic, spammy, off-topic, or posted in bad faith, and reduce their visibility in the Bluesky app. This approach adds friction — most viewers casually scanning a conversation won't encounter the toxic or potentially harmful replies — while preserving content access in case we get it wrong. These replies remain accessible in the thread for those who want to see them. We also made sure this feature is aware of who you follow: Replies from accounts you follow appear above the fold, while toxic replies from people you don't follow require an additional click to view.
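
A rough sketch of that visibility logic follows, assuming a per-reply toxicity score from an upstream classifier; the real system's signals and thresholds aren't published here, so the cutoff below is purely illustrative.

```typescript
// Illustrative sketch of the reply-visibility behavior described above.
// The toxicity score and threshold are hypothetical; the production system's
// signals (spam, off-topic, bad-faith) and cutoffs are not shown here.

interface Reply {
  uri: string;
  authorDid: string;
  toxicityScore: number; // 0..1, assumed to come from an upstream classifier
}

interface ThreadView {
  visible: Reply[];       // shown by default when scanning the thread
  behindClick: Reply[];   // still accessible, but require an extra click
}

export function arrangeReplies(
  replies: Reply[],
  viewerFollows: Set<string>,
  threshold = 0.8, // hypothetical cutoff
): ThreadView {
  const visible: Reply[] = [];
  const behindClick: Reply[] = [];

  for (const reply of replies) {
    const followed = viewerFollows.has(reply.authorDid);
    // Replies from accounts the viewer follows stay above the fold;
    // toxic replies from strangers are demoted but never removed.
    if (!followed && reply.toxicityScore >= threshold) {
      behindClick.push(reply);
    } else {
      visible.push(reply);
    }
  }
  return { visible, behindClick };
}
```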

After implementing this detection, daily reports of anti-social behavior dropped by approximately 79%. This reduction demonstrates measurable improvement in user experience: People are encountering substantially less toxicity in their day-to-day interactions on Bluesky.

Verification System

Identity verification is one of the bedrocks of social media — but there are lots of different approaches, not all of which provide helpful signal about account authenticity and provenance. To address this need on Bluesky, this year we launched a new verification system designed to help users identify authentic accounts of public interest. Building on our existing domain handle verification (which over 309,000 accounts use to link their Bluesky username to their website), we introduced visual verification badges as an additional layer of trust and clarity based on significant feedback from users who pushed for a centralized verification solution.
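
For readers curious how domain handle verification works under the hood, here is a simplified sketch of AT Protocol handle resolution, which checks a DNS TXT record at _atproto.<handle> or an HTTPS well-known endpoint; it is an illustration of the mechanism, not Bluesky's production resolver.

```typescript
// Sketch of the domain-handle check that underpins "your domain is your username".
// The AT Protocol resolves a handle to a DID via a DNS TXT record at
// _atproto.<handle> (value "did=<did>") or an HTTPS well-known endpoint;
// this is a simplified illustration, not Bluesky's production resolver.
import { resolveTxt } from "node:dns/promises";

export async function resolveHandleToDid(handle: string): Promise<string | null> {
  // 1. DNS method: TXT record at _atproto.<handle>
  try {
    const records = await resolveTxt(`_atproto.${handle}`);
    for (const chunks of records) {
      const value = chunks.join("");
      if (value.startsWith("did=")) return value.slice("did=".length);
    }
  } catch {
    // fall through to the HTTPS method
  }

  // 2. HTTPS method: GET https://<handle>/.well-known/atproto-did
  try {
    const res = await fetch(`https://${handle}/.well-known/atproto-did`);
    if (res.ok) {
      const did = (await res.text()).trim();
      if (did.startsWith("did:")) return did;
    }
  } catch {
    // network failure: treat the handle as unverified
  }
  return null;
}

// A handle is considered verified when the DID it resolves to matches
// the DID of the account claiming it.
export async function isHandleVerified(handle: string, accountDid: string): Promise<boolean> {
  return (await resolveHandleToDid(handle)) === accountDid;
}
```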

Verification on Bluesky goes beyond the decisions we make ourselves; our approach includes both direct verification by Bluesky and verifications performed by a network of Trusted Verifiers, independent organizations that can verify accounts within their domains of expertise. This reflects our belief that trust isn't only established centrally; it also comes from communities themselves. News organizations can verify their journalists, universities can verify faculty members, and sports leagues can verify their athletes. In 2025, 18 organizations were granted Trusted Verification on Bluesky, from Wired and CNN, to the European Commission and the City of Toronto.

By the end of 2025, we had verified 4,327 accounts total: 3,567 verified directly by Bluesky and 777 verified by our network of 21 Trusted Verifiers. This approach allows us to expand verification while drawing on the expertise of organizations best positioned to authenticate accounts within their own communities.

Age Assurance

We implemented age assurance measures in the Bluesky app for the first time, responding to new regulatory requirements in specific jurisdictions. This was new territory for us. Balancing legal compliance with our values of openness, privacy, and user control required significant engineering and operational investment.

The Challenge of Multiple Jurisdictions

Different countries and states have different — sometimes conflicting — views on how platforms should verify age and protect young people online. We recognize that promoting safety for young people is a shared responsibility, and we're committed to following the law. At the same time, we believe strongly in protecting user privacy and ensuring that safety measures don't create unnecessary barriers to free expression.

Threading this needle meant we couldn't use a single, one-size-fits-all solution. Instead, we built multiple age assurance approaches to meet specific local regulatory requirements while doing what we could to maximize user privacy and openness.

Implementation Across Jurisdictions

We work with Epic Games' Kids Web Services (KWS) to provide verification options in jurisdictions requiring age assurance. Implementations vary by location: some regions default to a child-friendly version with age verification for additional access; others require verification before accessing Bluesky. Different age thresholds and access levels apply depending on local law. Users outside these specific jurisdictions can continue using Bluesky as they always have.
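
As an illustration of what "implementations vary by location" can look like in practice, here is a hypothetical per-jurisdiction configuration; the regions, modes, and age thresholds below are example values only, not the actual rules applied in production or a statement of any law's requirements.

```typescript
// Illustrative configuration only: the jurisdictions, thresholds, and modes
// below are simplified examples of how per-region rules might be expressed,
// not production values or legal interpretations.

type AssuranceMode =
  | "none"                   // no age assurance required
  | "default-restricted"     // child-friendly defaults; verify to unlock more
  | "verify-before-access";  // verification required before using the app

interface JurisdictionRule {
  region: string;
  mode: AssuranceMode;
  adultThreshold: number;    // age at which full access is granted (example values)
}

const rules: JurisdictionRule[] = [
  { region: "GB", mode: "default-restricted", adultThreshold: 18 },      // example
  { region: "AU", mode: "default-restricted", adultThreshold: 18 },      // example
  { region: "US-MS", mode: "verify-before-access", adultThreshold: 18 }, // example
];

export function ruleFor(region: string): JurisdictionRule {
  return rules.find((r) => r.region === region) ?? { region, mode: "none", adultThreshold: 18 };
}
```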

2025 Age Assurance by Jurisdiction

Jurisdiction Accounts Age Assured Implementation Date
United Kingdom 274,163 July 2025
United States 52,924 Various (2025)
Australia 37,873 December 2025
Total 364,960

The total number of age-assured accounts listed above includes users previously verified via other KWS-enabled applications. These users are part of the KWS AgeGraph, a global network of over 30 million pre-verified users.

Building these jurisdiction-specific systems required substantial resources and difficult choices. In some cases, we faced laws we believe create significant barriers to free speech and disproportionately burden smaller teams like ours. For example, when Mississippi's age assurance law took effect in August, we had not yet built the technical infrastructure or operational capabilities needed to implement the proper systems. We initially made the difficult decision to block access to Bluesky entirely in the state of Mississippi. As we continued building age assurance systems for other jurisdictions throughout the year, our capabilities matured. By December, the work we'd done implementing age assurance in the UK, Australia, and multiple US states meant we could offer a solution for people 18 and over in Mississippi. While we continue to believe Mississippi's law raises serious concerns, we wanted to let adult users decide for themselves whether they're comfortable verifying their age to access Bluesky. This kind of pragmatic iteration — building our capabilities over time and expanding access where we can, while staying true to our principles — reflects our approach to these issues going forward.

Implementing age assurance across multiple jurisdictions was one of our largest operational undertakings in 2025, requiring custom engineering, privacy considerations, and ongoing compliance monitoring for each region.

Updated Policies & Enhanced Moderation Tools

In March 2025, we began a six-month process to update our Community Guidelines, inviting community feedback at multiple stages. More than 14,000 submissions shared perspectives, concerns, and examples of how different proposed elements of our guidelines might impact the communities on Bluesky. Implemented in October, the final Guidelines reflect this input while balancing safety, compliance, and our core values.

The updated Guidelines organize our policies into four foundational principles — Safety First, Respect Others, Be Authentic, and Follow the Rules — each containing specific policies that define our community standards:

Community Guidelines organized by four foundational principles: Safety First, Respect Others, Be Authentic, and Follow the Rules

Reporting System

Behind the public-facing Community Guidelines sits a classification system that structures how we understand and categorize potential harms across the app. This internal framework connects our 16 distinct Community Guidelines policies with 39 user reporting reasons, our labeling system, and enforcement actions, creating consistency across every moderation decision.

The framework serves multiple functions. It maps our policies to international regulatory requirements, ensuring compliance across jurisdictions while maintaining our principles. Operationally, it allows us to route incoming reports to specialized moderation teams—enabling moderators to develop deep expertise in specific policy areas like fraud, harassment, or intellectual property rather than requiring surface-level familiarity with everything. This specialization means faster, more nuanced decisions and reduces the cognitive burden that previously slowed response times.

Most visibly to users, this structure enabled us to expand reporting options from 6 broad categories to 39 specific reasons. Report signals now carry far more precision—routing immediately to the right moderation team, tagged with specific metadata, and contributing to an improved understanding of patterns of harm.
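
A small sketch of what reason-based routing can look like; the queue names and the reason-to-queue mapping are illustrative placeholders for the internal framework described above.

```typescript
// Sketch of reason-based report routing. The queue names and mapping below are
// illustrative stand-ins for the internal framework that links 39 reporting
// reasons to 16 policies and specialized review teams.

type ReviewQueue = "child-safety" | "harassment" | "fraud-spam" | "ip" | "general";

interface Report {
  reasonCode: string;      // one of the specific reporting reasons
  subjectUri: string;
  reporterDid: string;
  createdAt: Date;
}

// Partial, hypothetical mapping from reporting reasons to specialist queues.
const queueByReason: Record<string, ReviewQueue> = {
  "child-sexual-abuse-material": "child-safety",
  "grooming-or-predatory-behavior": "child-safety",
  "targeted-harassment": "harassment",
  "doxxing": "harassment",
  "spam": "fraud-spam",
  "scam-or-fraud": "fraud-spam",
  "copyright-violation": "ip",
};

export function routeReport(report: Report): { queue: ReviewQueue; priority: number } {
  const queue = queueByReason[report.reasonCode] ?? "general";
  // Child-safety reports are reviewed with the highest priority.
  const priority = queue === "child-safety" ? 0 : 1;
  return { queue, priority };
}
```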

The granularity this framework provides appears throughout this report: detailed breakdowns by policy category, severity analysis, and enforcement patterns that were previously impossible to track systematically. What users experience as clearer reporting options, our team experiences as sustainable infrastructure—and the data tells us both what's working and where we need to adapt.

Strike System

Not all rule-breaking content or behavior is equal, and this year we found that we needed a more proportionate approach to enforcement on Bluesky. In November, we introduced a strike system that brings consistency and proportionality to the application of our rules.

The system assigns four severity levels (low, moderate, high, and critical) to each of our policies, based on the potential harm that behavior could cause. Lower-risk violations prioritize warnings and temporary cooling-off periods, while higher-risk violations that cause serious harm to individuals or communities face stronger consequences. As violations accumulate, account-level actions escalate from brief to progressively longer suspensions to permanent bans. Critical violations—those demonstrating intent to abuse the site or posing immediate real-world harm—result in permanent removal on first offense.
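
The escalation logic might be sketched roughly as follows; the strike thresholds and suspension durations are assumptions for illustration, while the overall shape (warnings, escalating suspensions, permanent removal, and first-offense removal for critical violations) follows the description above.

```typescript
// Sketch of severity-based escalation. Specific durations and strike counts
// here are illustrative assumptions; only the general shape comes from the
// report (warnings, then longer suspensions, then permanent removal, with
// critical violations removed on first offense).

type Severity = "low" | "moderate" | "high" | "critical";
type Action =
  | { kind: "warn" }
  | { kind: "suspend"; hours: number }
  | { kind: "permanent-ban" };

export function nextAction(severity: Severity, priorStrikes: number): Action {
  // Critical violations: permanent removal on the first offense.
  if (severity === "critical") return { kind: "permanent-ban" };

  // Lower-risk violations escalate as strikes accumulate (example thresholds).
  if (priorStrikes === 0 && severity === "low") return { kind: "warn" };
  if (priorStrikes <= 1) return { kind: "suspend", hours: 24 };
  if (priorStrikes <= 3) return { kind: "suspend", hours: 72 };
  if (priorStrikes <= 4) return { kind: "suspend", hours: 24 * 7 };
  return { kind: "permanent-ban" };
}
```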

The strike system makes this escalation automatic and proportionate to the violation involved, and makes enforcement-related communications to users much clearer: which policy they violated, the severity level, their violation count, and their potential risk of permanent suspension on future violations.

With the infrastructure now in place, we're positioned to track severity patterns systematically and report comprehensive data in 2026. This foundation lets us enforce consistently while learning from our own data about which approaches work best.


PART II: MODERATION METRICS & OPERATIONS

Reports

User reports are a critical signal in our moderation operations. When someone reports content on Bluesky, a human moderator reviews that report against our Community Guidelines. Reports help us identify potential harms, understand what our community finds problematic, and direct our moderation resources where they're most needed.

In 2025, users submitted 9.97M reports (a 54% increase from 2024's 6.48M reports). This growth closely tracked our 57% user growth over the same period. These reports came from 1.24M users, representing approximately 3% of our user base who took the time to flag concerning content. Users can report various types of content to Bluesky's moderation service. Users mainly report individual posts and accounts, but may also report issues across other features: lists, profiles, feed generators, starter packs, and direct messages.

Due to the shift in reporting schema, we've roughly categorized the previous system's reports under the current reporting system. Reports that would have been made under anti-social behavior are now mapped to harassment, and spam falls under misleading.

Category Total Reports Percent of Total
Misleading 4.36M 43.73%
Harassment 1.99M 19.93%
Other 2.21M 22.14%
Sexual 1.35M 13.54%
Violence 24.7K 0.25%
Child Safety 25.5K 0.26%
Breaking Site Rules 16.7K 0.17%
Self-Harm 5.5K 0.05%

Report Volume and Patterns

Total Number of Reports by Category Over Time (Stacked Bar)

Throughout 2025, we saw a drop in user reports month over month. Reports per 1,000 Monthly Active Users (MAU) declined 50.9% from January to December, potentially indicating reduced problematic content density. Part of this decline traces back to the massive influx of users starting in November 2024 around the US election, which left us with significant review backlogs at the beginning of 2025; content awaiting review may have been reported multiple times, inflating the overall number of reports. In May 2025 we further expanded our staffing specifically focused on bots, spam, and other malicious scaled activity, likely contributing to a significant reduction in user reports in this area from May onwards. Our focus on toxicity delivered a rapid 79% drop in harassment and anti-social reports.

Understanding Multiple Reports

We prioritize reports based on severity and behavioral patterns rather than volume alone. Multiple reports help us identify patterns of harmful behavior, though reporting the same piece of content repeatedly doesn't accelerate our review.

In 2025, 72.9% of reported accounts received multiple reports, a pattern reflecting the complex realities of content moderation. Some accounts generate multiple legitimate reports through repeated violations, while others receive concentrated attention due to controversial content or coordinated reporting campaigns. Understanding what drives these patterns is essential to applying appropriate enforcement.

When accounts receive multiple reports, moderators assess whether they represent distinct violations across different posts or the same content reported repeatedly. Accounts that repeatedly violate our guidelines accumulate strikes through our graduated enforcement system, escalating from temporary suspensions to permanent removal based on violation severity and frequency. This process (detailed in the Takedowns section below) explains why an account may receive multiple reports before permanent suspension: we distinguish between users who make occasional mistakes and those engaged in persistent harmful behavior.

Controversial content often generates high report volume from users with genuinely different perspectives on whether it violates our guidelines. We assess each report based on our Community Guidelines rather than removing content based on report volume alone, protecting space for contentious but policy-compliant discourse. At the same time, coordinated abuse of the reporting feature to silence permissible speech violates our Community Guidelines. The distinction matters: many people independently reporting content they find harmful represents community concern, while coordinated campaigns targeting users for their viewpoints represent an attempt to weaponize our moderation systems.

Reports that don't result in enforcement actions still inform our work. Patterns in these reports reveal where our policies may need clearer articulation, where we should examine enforcement consistency, and where community expectations differ from our standards.

Breaking Down the Report Categories

Spam: The Pervasive Challenge

With 4.36M reports, misleading content was our most-reported category. Spam alone accounted for 2.49M of these, reflecting the persistent challenge of repetitive, unsolicited content. We categorize spam as "misleading" because it's fundamentally about site manipulation that erodes trust. Other reports in this category included bot accounts (62.77K reports), which capture automated inauthentic behavior, and impersonation (14.68K reports), where users pretend to be someone else.

Harassment: A Gray Area

Harassment reports (1.99M total) encompassed a wide spectrum of behavior, from serious violations to everyday incivility. The new taxonomy distinguishes hate speech (55.40K reports), targeted harassment (42.52K reports), trolling (29.48K reports), and doxxing (3.17K reports) from more general unkindness.

The large volume in "Harassment-Other" (1.86M) reflects both legacy mapping from the older "Anti-Social Behavior" category and the reality that reported behavior often falls into a gray area. It may be rude or unkind without rising to the level of hate speech or targeted abuse. Our moderation team reviews these reports individually, recognizing that context determines whether behavior violates community standards. The cumulative burden of incivility, even when it doesn't violate guidelines, affects discourse quality and user experience at scale.

Sexual Content: Labeling vs. Harm

Sexual content reports (1.52M total) primarily concerned mislabeling rather than serious harm. When we introduced a specific "Unlabeled Adult Content" reporting option in November, it quickly became one of our most-used categories with 83.27K reports in six weeks alone. This reflects a common friction point: adult content that's allowed on Bluesky but should be properly labeled so users can control their experience. We will be making changes in 2026 to properly incentivize accurate labels on adult content.

Unlike other sites where adult content is suppressed or banned, self-labeling on Bluesky doesn't result in removal or downranking; it simply respects user choice. Proper self-labeling reduces the burden on both our moderation team and users who report unlabeled content, allowing us to focus resources on serious harms like non-consensual imagery and abuse content.

A much smaller subset of reports involved serious harms: non-consensual intimate imagery (7.52K), abuse content (6.12K), and deepfakes (2.02K), a harm vector reflecting evolving technological threats.

Bluesky allows consensual adult sexual content but requires proper labeling for user control and minor safety. We run automation on all images and video to identify and label potential adult content, but even with an exceptionally high accuracy rate, the volume matters: roughly half a million posts with images or video are uploaded every 24 hours, so if 1% have inaccurate detections, that results in up to 5,000 pieces of media that users don't want to encounter as part of their experience on Bluesky.

Violence and Additional Categories

Violence-related reports (24.67K total) were led by threats or incitement (10.17K reports), followed by glorification of violence (6.63K reports) and extremist content (3.23K reports). Child safety reports (25.47K total) are discussed in detail in the Child Safety section of this report. Breaking Site Rules (16.7K) and Self-Harm (5.5K) rounded out our reporting categories, both representing less than 0.2% of total reports, with submissions fairly evenly distributed across specific violation types.

Uncategorized Reports

One in five user reports were "Uncategorized" (2.21M reports). This grouping is substantial for several reasons: "Other" existed as a reporting option in both our old and new systems to capture novel or hard-to-categorize harms, and we also used this bucket to conservatively map older reports that we couldn't confidently place into specific new categories.

As users adopt the new reporting flow in the UI, we expect this general category to decrease while more specific categories increase, even if the actual prevalence of harms hasn't changed. We anticipate these shifts will reflect better categorization rather than changes in user behavior.

Where Reports Come From

Beyond posts, users also report issues across other features: lists, profiles, feed generators, starter packs, and direct messages. Together, these make up the different content types users can create on Bluesky.

Product Surface User Reports
DMs 134,951
Lists 119,344
Profiles 19,033
Feed Generators 7,338
Starter Packs 1,041

Direct Message Reports

In 2025, we received 134.95K DM reports from users who encountered spam, scams, and unwanted contact in private conversations. Nearly half of all DM reports (48.48%) involved misleading content—primarily spam operations and scam attempts that target users through direct outreach. Sexual content and harassment each accounted for roughly a tenth of DM reports.

Category DM Reports % of Total
Misleading 22,894 48.48%
Uncategorized 13,573 28.74%
Sexual 5,292 11.21%
Harassment 5,097 10.79%
Breaking Site Rules 205 0.43%
Child Safety 73 0.15%
Violence 57 0.12%
Self Harm 30 0.06%
Grand Total 47,221 100.00%

DM reporting gives our moderation team visibility into harmful behavior that occurs outside public view, allowing us to identify and remove accounts that systematically abuse private messaging for spam campaigns, harassment, or exploitation. When users report a DM, moderators review the reported message and surrounding messages for context to assess whether it violates our Community Guidelines. We abide by the US Stored Communications Act and do not provide DMs to third parties unless ordered by a court. Access is extremely limited and tracked internally.

Other Features

Users reported problematic content across other features as well: 119.34K reports for lists, 19.03K for profiles, 7.34K for feed generators, and 1.04K for Starter Packs.

These reports revealed specific abuse patterns. Lists — designed as a tool for users to curate their experience — were sometimes weaponized for harassment, violating our policy against using moderation lists for abuse. Reported profiles involved slurs and coded hateful language embedded in usernames or descriptions. Features like Follows and Starter Packs became targets for spam and site manipulation, with bad actors using mass-following and coordinated Starter Pack distribution to game visibility and reach.

While these surfaces generate far fewer reports than posts, the harms they enable — harassment, hate speech, and site manipulation — require specialized attention to address effectively. Our content-level enforcement detailed in the Takedowns section includes actions taken across all these features.

Proactive Detection

Proactive detection represents our shift from reactive moderation — responding to problems after users experience them — to proactive systems that aim to identify and address harms before they spread. While much of this work is automated, it also includes specialized human monitoring for coordinated manipulation and event-driven review during high-activity periods.

Automated Detection

Our automated detection systems identify potential violations before users encounter them or report them. These systems focus on categories where we can reliably identify patterns in behavior (like spam patterns and known bot attack signatures), or where content is so harmful (such as Child Sexual Abuse Material identified via CSAM hashes) that catching it proactively protects both our community and our moderation team from exposure.

Our detection infrastructure combines heuristics, pattern matching, hash databases, and machine learning models trained specifically for content moderation. In 2025, automated systems flagged 2.54M potential violations across accounts and posts. As we continue building out these capabilities, expanding and refining our detection models will remain a priority area for 2026.

Not every flagged item represents an actual violation. High-confidence signals may result in immediate action, while lower-confidence flags route to human moderators for review before enforcement. This design ensures speed where we have certainty while maintaining human judgment for edge cases and context-dependent situations.
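
A minimal sketch of that confidence-based routing; the categories and threshold are hypothetical, but the principle matches the text: automate only clear-cut, high-confidence cases and send everything else to human review.

```typescript
// Sketch of confidence-based routing for automated detection flags. The
// thresholds and category names are hypothetical stand-ins.

interface DetectionFlag {
  subjectUri: string;
  category: "spam" | "csam-hash-match" | "bot-network" | "other";
  confidence: number; // 0..1 from the detection model or rule
}

type Disposition = "auto-enforce" | "human-review";

export function routeFlag(flag: DetectionFlag, autoThreshold = 0.97): Disposition {
  // Hash matches against known illegal material are always actioned immediately.
  if (flag.category === "csam-hash-match") return "auto-enforce";

  // Behavioral categories with very high confidence can be enforced automatically;
  // anything context-dependent or lower-confidence goes to a moderator.
  if ((flag.category === "spam" || flag.category === "bot-network") && flag.confidence >= autoThreshold) {
    return "auto-enforce";
  }
  return "human-review";
}
```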

Influence Operations

Influence operations are coordinated campaigns that attempt to manipulate public discourse through deceptive or inauthentic behavior. These efforts typically involve networks of accounts working together to amplify narratives while obscuring their true origin, intent, or sponsorship.

In 2025, we developed foundational capabilities to identify, track, and disrupt suspected influence operations on Bluesky, along with hiring a dedicated analyst for these investigations. This included expanding internal monitoring systems, strengthening investigative workflows, and increasing the range of enforcement actions available to address coordinated inauthentic activity. We also collaborated with public-sector partners and independent researchers to investigate and mitigate suspected influence operation campaigns. This work is ongoing and will continue to mature in 2026.

Our approach to identifying suspected influence operations relies primarily on behavioral patterns and technical indicators—such as coordination signals and account relationships—rather than on content analysis alone.

During the reporting period, we observed activity consistent with known influence operation tactics, including impersonation of journalists and researchers, the use of inauthentic or misleading account identities, and the deployment of generative media. We also observed attempts to leverage the interoperability of decentralized and federated services, such as operating accounts on external platforms and using bridging services to bring that content onto Bluesky. While we are able to take enforcement action on bridged content within our own network, these approaches present emerging challenges for detection and attribution and underscore the importance of continued cross-platform collaboration.

Enforcement Actions

During the reporting period, we removed 3,619 accounts for involvement in suspected influence operations. Most of these accounts were assessed as linked to foreign state-aligned actors, including actors likely operating from Russia, with smaller numbers associated with other suspected state-aligned campaigns.

This figure reflects accounts identified through a combination of manual investigations and automated systems specifically designed to surface influence operation activity. The true number of accounts impacted is likely higher, as additional accounts may have been removed through standard enforcement processes or broader automated systems without being explicitly attributed to influence operations.

We remain committed to transparency about coordinated influence activity on Bluesky and to working with the broader research and policy community to better understand and counter these threats across decentralized platforms.

How Detection and Reporting Work Together

Proactive detection and user reports serve complementary roles. Automated systems, with human-in-the-loop review, excel at scale-based detection: identifying spam networks, matching against CSAM databases, and catching bot behavior patterns. User reports surface context-dependent harms like harassment campaigns, misleading content requiring cultural context, and emerging violation patterns our systems haven't yet learned to detect. Together, these approaches create multiple layers of protection rather than relying on any single detection method. Flagged content then moves through our enforcement pipeline, where it may receive a label, be taken down, or require no action. The Labels and Takedowns sections detail these outcomes.

Labels

Total Labels Applied

In 2025, Bluesky applied 16.49M labels across both accounts and individual pieces of content — a 200% increase from 2024's 5.5M labels. At the same time, account takedowns grew just 104% (from 1.02M to 2.08M). This divergence tells a story: we're preserving more content on the app with appropriate warnings rather than removing it entirely.

Labels serve a different purpose than takedowns. Where takedowns remove content, labels give users control. Each label lets people configure their own preferences — choosing to hide, warn, or show content based on their comfort level. As our systems mature, we maintain more diverse expression while reducing unwanted exposure.

Labels as User Infrastructure

  • User Choice at Scale: Users configure what they see rather than only accepting app-wide decisions. Adult content stays accessible to those who want it; those who don't can hide it entirely.
  • Compliant Access: In jurisdictions requiring age assurance, labels let us restrict minor access to adult content while keeping it available to adults.
  • Granular Control: Different labels for different harms mean users can fine-tune their experience — maybe they're comfortable with nudity in art but not explicit pornography, or want to see robust debate but not slurs.

How Automated Labeling Works

The majority of labels applied (95.34%) come from automated systems. Posts can have a video or up to four images attached. Users created 235M posts with media in 2025, and profile photos and banners also need to be scanned. Following standard industry practice for platforms with user-generated content, every image and video uploaded to Bluesky is sent to a third-party provider (Hive) for assessment. The provider returns verdicts that are matched to our label categories (e.g. adult, suggestive, nudity, graphic-media, and self-harm), which are applied automatically. Neither Bluesky nor the vendor retains these images or uses them for training generative AI systems. As an open, decentralized network, all content on Bluesky is publicly accessible and indexable by design.
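
A simplified sketch of how vendor verdicts might be mapped onto label categories; the verdict shape, class names, and threshold are assumptions and do not reflect the vendor's actual API, while the label names come from this report.

```typescript
// Sketch of mapping third-party media-classification verdicts onto Bluesky's
// label vocabulary. The verdict shape, class names, and threshold are
// hypothetical; the label names (adult, suggestive, nudity, graphic-media,
// self-harm) come from the report above.

interface MediaVerdict {
  // Hypothetical normalized scores per class, 0..1.
  scores: Record<string, number>;
}

const classToLabel: Record<string, string> = {
  explicit_sexual: "adult",
  suggestive: "suggestive",
  artistic_nudity: "nudity",
  gore_or_violence: "graphic-media",
  self_harm: "self-harm",
};

export function labelsForMedia(verdict: MediaVerdict, threshold = 0.9): string[] {
  const labels = new Set<string>();
  for (const [cls, score] of Object.entries(verdict.scores)) {
    const label = classToLabel[cls];
    // Apply a label only when the mapped class clears the confidence threshold.
    if (label && score >= threshold) labels.add(label);
  }
  return [...labels];
}
```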

Top Labels

Label Category Total Applied % Applied Manually
adult 10.79M 1%
suggestive 3.42M 3%
spam 761.77K 1%
nudity 389.44K 18%
needs-review 370.14K 0%
rude 295.38K 100%
sexual-figurative 152.58K 0%
graphic-media 127.46K 3%
!hide 68.26K 2%
intolerant 60.20K 100%
self-harm 20.92K 8%
threat 15.86K 100%

Labels reflect our enforcement philosophy: address the harm, not the person. In 96.57% of cases, we label specific content rather than marking entire accounts. Categories like intolerance (98% at the content-level and 2% account-level), threat (100% content-level), and rude (99% content-level) target individual posts because one problematic post doesn't define an entire account. Account-level labels serve two purposes: enforcing against bad actors (impersonation, spam networks) and managing escalation when moderators identify patterns of repeated violations. Our operational label (needs-review) works exclusively at the account level by design.

Where Human Judgment Matters

While automation handles high-confidence signals (14.2M labels for adult content detection alone), human moderators concentrate on assessments requiring cultural competency and contextual understanding.

  • Rude (295.38K labels): Our largest human moderation workload. Applied when content is disrespectful without constructive purpose but doesn't violate harassment policies. Nearly all labels target individual posts; when moderators identify repeated patterns, they may escalate to account-level labels. Rudeness requires understanding context, intent, and cultural norms — exactly the kind of nuanced judgment that benefits from human expertise.
  • Intolerance (60.20K labels): Distinguishing hate speech from vigorous criticism requires understanding cultural context, language nuance, and intent. Applied almost entirely manually to specific content rather than entire accounts.
  • Threat (15.86K labels): Determining whether content represents genuine danger vs. hyperbole means understanding relationship context and evaluating credibility. Every threat assessment receives human review.

Labels for Specialized Workflows

  • needs-review: Applied automatically to combat large-scale spam attacks and bulk abuse patterns (370.14K accounts in 2025). Acts as a temporary holding measure that minimizes user impact while giving moderators time for assessment and enforcement decisions.
  • !hide: Removes content from public view but preserves it for internal verification (68.26K applications). Used primarily for toxic lists, we apply the label automatically upon receiving a report when we detect toxic terms in the list name and description. This allows the user to fix their list name and description without deleting their content.

Takedowns

Total Takedowns

Takedowns represent our final enforcement mechanism, reserved for violations that can't be addressed through labels. In 2025, we removed 2,446,334 items, covering both accounts and content.

Our approach balances precision with decisive action: removing individual posts when possible, removing accounts when necessary, and concentrating automation on bad-faith actors while human moderators handle context-dependent violations.

Takedowns by Policy

The center pie chart shows a breakdown by the four foundational principles, and the outer ring shows more granularity. Due to the reporting criteria change in late 2025, we will have more precision in 2026 once we have a full calendar year under the new reporting criteria.

Takedowns by Policy - pie chart showing breakdown by four foundational principles

Accounts vs. Content

Our enforcement philosophy targets harm, not people. When violations involve specific content, we remove the problematic material while preserving the account. When violations involve bad-faith actors — impersonators, spam networks, coordinated manipulation — account-level action becomes necessary. Our strike system allows for up to four violations before an account is permanently suspended.

This distinction appears clearly in the data: Respectful Discourse and Intellectual Property violations both resulted in 64% content-level takedowns, removing harmful content while preserving accounts. By contrast, our "Be Authentic" policies — Account Authenticity, Trust & Transparency, and Repeated Violations/Ban Evasion — show the opposite pattern, with nearly all takedowns at the account level. These policies target people operating in bad faith: impersonators pretending to be someone else, spam operations, and ban evaders circumventing prior enforcement. For these violations, account removal is the appropriate response.

Labels vs. Takedowns

This philosophy of user control extends across categories. Adult content generated 14.2M labels compared to less than 2K takedowns, since consensual adult content is allowed on Bluesky with proper labeling. Respectful Discourse violations produced 295K "rude" labels but 32K takedowns (a 9:1 ratio), letting users decide whether they want to see impolite content while removing serious harassment and threats.

Graduated Enforcement

Not all violations warrant the same consequences. While most takedowns are permanent, our strike system (detailed in the Enhanced Moderation Tools section) tracks patterns and escalates responses from warnings to temporary suspensions to permanent removal.

This graduated approach means accounts may receive multiple reports before permanent removal. Moderators track whether reports represent distinct violations accumulating toward suspension, or whether they stem from a single incident reported multiple times or coordinated reporting campaigns. Genuine repeat violators progress through the strike system, while accounts reported multiple times for reasons other than policy violations don't accumulate strikes.

Nearly all temporary suspensions involve human review and contextual judgment. We issued 13,192 temporary suspensions in 2025, with 69% lasting 24 hours, 29% lasting 72 hours, and just 2% extending to seven days. These brief cooling-off periods give users clear signals to change behavior, reflecting our belief that people sometimes say things behind screens they wouldn't say in person. For discourse violations where behavior can improve, graduated enforcement provides room to learn. However, serious violations like death threats, credible harm, and sustained harassment campaigns lead to immediate consequences regardless of prior account history.

By contrast, automated takedowns are nearly always permanent, reflecting our approach of only automating enforcement when we have high-confidence signals about clear-cut violations.

Zero Tolerance Categories

Certain categories require immediate and permanent removal. Coordinated inauthentic behavior, spam networks, and impersonation made up 81% of policy-attributed manual enforcement across three "Be Authentic" policies: Trust & Transparency (782,047 takedowns), Account Authenticity (48,469), and Site Security (33,518). These aren't users who made mistakes; they're organized attacks on site integrity operating in deliberate bad faith.

Ban evasion operates under the same zero-tolerance principle. When users removed for policy violations return on new accounts, they face immediate re-removal, ensuring consequences have meaning. We issued 14,659 permanent removals for ban evasion in 2025.

Child Safety violations led to 30,236 takedowns, with 98.9% requiring human review. More details on our child safety operations, NCMEC reporting, and hash-matching systems appear in the dedicated Child Safety section below. For these categories, decisive enforcement protects the foundation of trust that enables healthy conversation.

Enforcement Priorities and Automation Restraint

The "Be Authentic" policies that comprised most takedowns (81%) mirror patterns in user reporting, where such Misleading content consistently ranks as the top concern. This alignment reflects a shared understanding between our enforcement systems and community experience: coordinated inauthenticity — spam, impersonation, and manipulation — disrupts healthy conversation.

We concentrate automation on detecting bad-faith actors rather than attempting to adjudicate speech. The "Be Authentic" policies leverage automated detection for coordinated spam networks, bot operations, and ban evasion patterns, where technical signals or behavioral patterns reliably identify inauthentic behavior. Most other policy categories rely primarily on human review, because context-dependent violations — harassment, hate speech, intellectual property, mental health content, public safety concerns — require cultural competency, legal expertise, and nuanced judgment that automation cannot provide. This approach reflects our commitment to addressing harm without overreach, removing bad actors decisively while giving good-faith users opportunities to change behavior.

Appeals

Less than 1% of our user base (267,509 users) submitted at least one appeal in 2025. This volume reflects both site expansion and our increased use of labeling systems, which create more opportunities for users to contest moderation decisions. Given that we applied 16.49M labels and issued 2.45M takedowns, the appeal rate demonstrates that most enforcement decisions stand without challenge.

The majority of appeals challenge automated labels applied to posts. Users can also appeal account suspensions directly in the app, though post takedowns currently require email appeals since removed content is no longer visible in-app.

Every appeal receives human review. Trained moderators assess edge cases, context-dependent situations, and potential false positives to ensure fair outcomes. While automation helps us scale detection and enforcement, human judgment remains essential for handling the nuance and complexity that contested decisions demand.

We're building improved in-app communication tools that will let users see outcomes of both their reports and appeals more transparently, part of our broader work making moderation decisions more visible and understandable to the community.


PART III: LEGAL & SAFETY

Legal Requests

In 2025, we received 1,470 legal requests from law enforcement agencies, government regulators, and legal representatives — a significant increase from 2024's 238 requests. This increase reflects both our continued user growth and expanded international awareness of Bluesky as a place for public conversation.

Request Overview

Of the 1,470 total requests we received, we complied with 1,334 valid requests, representing a 90.7% compliance rate. The remaining 130 requests were invalid, meaning we couldn't comply due to factors such as no records found, insufficient information provided, duplicate requests, or requests that didn't meet legal requirements.

Type of Request Total Valid Invalid
Records Requests 968 906 56
Preservation Requests 131 121 10
Takedown Requests 196 132 64
Inquiries 175 175 0
TOTAL 1,470 1,334 130

Records Requests are requests for user data, including subpoenas, search warrants, court orders, and emergency data requests. Preservation Requests ask us to preserve user data pending legal authorization to transfer it. Takedown Requests are regulatory content removal requests under various jurisdictions, including the EU Digital Services Act, UK Online Safety Act, and other country-specific regulations. Inquiries are general questions about our processes or procedures that don't result in data production.

Geographic Distribution

The majority of requests came from Germany, the United States, and Japan, with additional requests from authorities in the UK, France, Brazil, South Korea, Australia, and the Netherlands.

Processing and Response

Our high compliance rate demonstrates our commitment to working cooperatively with legitimate legal processes while protecting user privacy. We comply with all valid requests that meet legal requirements and include sufficient information for us to locate the requested data. When we receive invalid requests, we work with the requesting party to clarify what information is needed or explain why we cannot fulfill the request.

Copyright & Trademark

In 2025, we received 1,546 copyright and trademark cases, a 65% increase from 2024's 937 cases. This growth reflects our expanding user base and increased monitoring activity from professional copyright companies on the app.

We use a structured copyright reporting form that allows rights holders to submit detailed claims efficiently. We respond promptly to valid legal requests by removing or disabling access to infringing material when appropriate.

In September 2025, we updated our Copyright Policy to streamline our takedown procedures and strengthen protections against abusive reporting. The updated policy ensures compliance with the US Digital Millennium Copyright Act (DMCA), EU Digital Services Act, and similar laws globally, while clarifying our transparency reporting obligations.

Child Safety

Child safety requires specialized systems, dedicated expertise, and unwavering commitment. In 2025, we strengthened our child safety infrastructure through expanded policies, improved user reporting pathways, and continued use of automated detection systems that remove illegal content before moderators are exposed to it. These efforts complement our age assurance work (detailed earlier in this report) and reflect our broader commitment to protecting children — both those using our site and victims of child sexual exploitation and abuse worldwide.

Enhanced Policy and Reporting

While our previous Community Guidelines prohibited sexualizing or exploiting minors with a zero-tolerance stance, we expanded this framework in 2025 with a comprehensive standalone Child Safety policy. This policy explicitly addresses the full range of harms children face in online spaces: exploitative content, predatory conduct, privacy violations, and inappropriate commercial practices. It reflects both our values and our commitment to meeting international regulatory standards with clear, enforceable rules.

We also gave users more intuitive ways to report child safety concerns with greater specificity. Previously, reports went through a broad "Illegal and Urgent" category that captured many unrelated issues. Now, users select from dedicated options: Child Sexual Abuse Material, Grooming or Predatory Behavior, Youth Privacy Violation, Youth Harassment or Bullying, and "Other Child Safety" for concerns that don't fit existing categories. These categories comply with legal requirements and help us handle high-priority reports appropriately. We route child safety reports with high priority for moderator review, and any matches against existing hashes are immediately actioned by the team.

Detection and Enforcement

We continued using hash-matching systems to identify known cases of child sexual abuse material. When content matches these systems, it's immediately diverted — removed from public access and our infrastructure before human moderators or users view it. Only after diversion does our specialized child safety team review the material to prepare reports for authorities. This helps reduce harmful exposure to our users as well as moderators who aren't trained on those queues.
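
In outline, the diversion flow described above looks something like the following sketch; the hash database, media store, and queue interfaces are simplified stand-ins rather than the actual integration.

```typescript
// Sketch of the hash-match diversion flow described above: known-bad media is
// removed from public access before any person views it, then queued for the
// specialist team that prepares reports to authorities. All interfaces here
// are simplified stand-ins, not the production integration.

interface HashDatabase {
  isKnownMatch(mediaHash: string): Promise<boolean>;
}

interface SpecialistQueue {
  enqueue(item: { mediaHash: string; accountDid: string }): Promise<void>;
}

interface MediaStore {
  removeFromPublicAccess(mediaHash: string): Promise<void>;
}

export async function screenUpload(
  mediaHash: string,          // hash of the uploaded media
  accountDid: string,
  db: HashDatabase,
  store: MediaStore,
  queue: SpecialistQueue,
): Promise<"diverted" | "allowed"> {
  if (await db.isKnownMatch(mediaHash)) {
    // Divert first: no moderator or user sees the content before removal.
    await store.removeFromPublicAccess(mediaHash);
    await queue.enqueue({ mediaHash, accountDid });
    return "diverted";
  }
  return "allowed";
}
```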

In 2025, our automated systems flagged and diverted 12,647 pieces of content for specialist review. Following human assessment, we submitted 5,238 reports to the National Center for Missing and Exploited Children (NCMEC), each containing account details and confirmed illegal material. Our detection capabilities are supported by Thorn's Safer tool, which provides industry-standard hash databases and additional detection methods. In addition, our child safety experts investigate known links between users suspended for sharing violative child safety content, working to disrupt networks before they form on Bluesky.

In 2025, we removed 6,502 posts for child safety violations out of 1.41 billion posts created on Bluesky — less than 0.001% of all content. This includes not just child sexual exploitation and abuse material but also other harms like minor privacy violations and underage commercial targeting. While statistically small among the billions of pieces of content created on Bluesky, we treat every case with the seriousness it demands.

Scale and Reality

As Bluesky grew and opened to the public, we attracted not just genuine community members but also bad actors. Child exploitation exists across the internet, and no app can eliminate it entirely. What matters is how quickly we detect it, how thoroughly we remove it, and how effectively we report it to authorities. Our approach combines proactive automated systems with human expertise, reducing moderator exposure to harmful content while maintaining the judgment and context required for accurate enforcement and reporting.


PART IV: CONCLUSION

Our Trust & Safety roadmap for 2026 focuses on three areas: building out core safety features, enhancing user experience, and investing in the broader ecosystem. Building out core safety features will cover projects such as better communicating with users about their reports and violations. Enhancing the user experience will focus on projects such as both improving the accuracy of our automated adult labels, and incentivising accounts to use self-labels on their own content. Investigating in the broader ecosystem will revisit labels, and third party moderation services - both on Bluesky and across the AT Proto ecosystem to see how we can put more power in the hands of communities to moderate speech.

2025 demonstrated that it's possible to build healthier social media at scale. We've shown that proactive systems can reduce toxic content, that age assurance can work across multiple jurisdictions while respecting privacy, and that moderation can be both consistent and context-aware.

Building this infrastructure required difficult trade-offs and substantial resources. We didn't get everything right, and we know there's a lot more work ahead. But the foundation we've built this year—the systems, the processes, the operational maturity—positions us to handle whatever comes next while staying true to our mission of empowering people instead of exploiting them.

As always, we're listening. Your feedback shapes our work, and we're committed to building this app together with the community it serves.
