Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Hit Subscribe Digest


Welcome to Hit Subscribe’s Monthly Digest! In this edition, we’re excited to share a collection of recent blog posts we’ve written for our clients. Plus, stick around till the end—we’ve included a meme of the month to keep things fun!

MCP Server Security: Everything You Need to Know 

If you’ve been using LLMs lately, you’ve probably seen how easy it is to connect them to your stack—cloud tools, ticketing systems, and internal APIs—with just a bit of setup. The issue is that this also makes it easy to accidentally expose sensitive systems. We’re already seeing failures like OpenClaw-style incidents, where overly permissive agent connections led to serious data loss, including wiped email inboxes.

AI is a double-edged sword: it can dramatically improve workflows or just as easily scale mistakes when agents start calling tools without clear boundaries.

In this post, I’ll break down MCP Server Security, why it matters, and how to run MCP servers more safely in real-world environments.

MCP Server vs. Client: How the Roles Compare

AI is getting better at writing code and analyzing systems, but without a structured way to connect it to your tools, integrations can become fragile, insecure, and hard to scale.

The Model Context Protocol (MCP) solves this by providing a standardized bridge between AI and external systems. At the core of this architecture are two components: the MCP server and the MCP client.

This article explains how MCP servers and clients work together to enable secure, efficient AI workflows—and how Tricentis uses MCP to make integrations more reliable at enterprise scale.
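To make the server/client split concrete, here is a minimal sketch of the JSON-RPC message shapes MCP defines for a tool call. The tool name and arguments are hypothetical, and a real exchange also includes initialization and capability negotiation:

```python
import json

# Hypothetical MCP tool call. MCP messages are JSON-RPC 2.0, and
# "tools/call" is the method a client uses to invoke a server-side tool.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "create_test_case",  # hypothetical tool exposed by the server
        "arguments": {"title": "Login succeeds with valid credentials"},
    },
}

# The server validates the call against the tool's declared schema and
# returns a structured result keyed to the same request id.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "Created test case TC-101"}],
        "isError": False,
    },
}

# Both sides exchange these as serialized JSON over the transport
# (stdio or HTTP), which is what makes the integration standardized.
wire = json.dumps(request)
assert json.loads(wire)["method"] == "tools/call"
```

Because the client discovers tools at runtime rather than hardcoding them, one client can talk to many servers through the same message shapes.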

How to Connect to an MCP Server: A Practical Guide 

If you want AI to speed up your testing without causing mistakes or breaking things, connecting to an MCP server is the way to go.

This guide shows how to safely link AI clients to your tools, manage context, and get work done—like creating test cases, updating test plans, and organizing test data—without risking errors or lost information.

Agentic Quality Assurance: A Guide to AI-Driven QA

Quality assurance is changing. As software systems become more complicated and companies release updates faster, the old ways of testing software aren’t keeping up.

Agentic AI is helping quality assurance keep pace by allowing testing systems to make smart decisions on their own instead of relying only on fixed instructions.

In this guide, you’ll learn what agentic quality assurance means, why it is important, best practices, and how teams are using it in real-world situations.

Agentic Test Management: A Practical Guide 

Test management becomes harder as teams scale. Backlogs grow, release cycles speed up, and test data becomes scattered across tools. Test managers often spend more time triaging and prioritizing than focusing on quality outcomes, and manual decision-making starts to break down at scale.

Agentic AI offers a new approach. Instead of relying on humans for every decision, agentic test management lets AI agents analyze context, adapt test plans, and act in real time.

This guide explains what agentic test management is, how it differs from traditional QA, how it works, and how to apply it to your workflows.

Agentic Performance Testing: A Practical Guide

Platforms like Replit, Lovable, and Emergent are making it easier for vibe coders to build and debug code, and the output is increasingly moving beyond the “vibe test.” This is where agentic testing comes in.

Enterprise adoption is also accelerating. A 2025 KPMG survey found that 65% of companies over $1B in revenue have moved from AI agent experimentation into active pilots.

Anthropic’s research suggests agents will evolve from handling short tasks to autonomously building and testing full systems with only periodic human oversight.

This guide explores agentic performance testing, key use cases, and how it works in practice.

What Is an MCP Server? The Complete Guide 

MCP servers are the backbone of the Model Context Protocol, acting as the bridge between AI systems and the tools, data, and services they need to interact with. They make it possible for LLMs to securely call APIs, run actions, and access context in a structured, standardized way—without custom integrations for every tool. 

In this guide, we break down what an MCP server is, how it fits into the Model Context Protocol, and why it’s becoming a key building block for connecting AI systems to tools and data in a secure, structured way. 

Cross Browser Testing: A Complete Introductory Guide 

Web applications today cater to a variety of platforms, browser engines, and devices. Users across the globe may use any combination of browser, device, and platform to access an application.

Cross-browser testing helps to create a consistent behavior across different browsers on different platforms and devices for an application.

In this post, we’ll learn what cross-browser testing is, why it’s important, how to perform it, how to navigate any challenges one might run across, and explore some of the best practices.

Ready to take the guesswork out of SEO? Check out Osiris.

Automated Acceptance Testing: A Guide to Get Started

There are so many types of automated software testing that learning about all of them and keeping them straight in your head is a challenge.

In this post, we’ll provide a guide on automated acceptance testing, which is essential if you want your applications to meet users’ requirements and stay that way.

Validation Testing: Everything Beginners Need to Know 

Software development often feels like building a bridge while the landscape keeps shifting. Even if the code is correct, it still needs to meet the right user needs.

That’s where validation testing comes in. It ensures the software we build actually solves the problems users care about, not just that it works technically.

While developers focus on making code run, quality teams focus on making sure it delivers real value.

In this post, we explore validation testing—what it is, why it matters, and how it helps ensure the software you build not only works, but actually meets user needs. 

Automated Usability Testing: What It Is and How It Works 

Digital transformation is changing how businesses interact with customers. Today, software must do more than work correctly—it needs to deliver a seamless, intuitive experience.

Think of functionality as the engine of an application, and usability as the steering wheel and dashboard that guide the user. If those don’t work, the product fails to deliver value, even if the engine runs perfectly.

That’s where usability testing comes in.

In this post, we’ll explore what automated usability testing is, why it matters, and how agentic AI can enhance your usability testing framework.

Automated Security Testing: Your Guide to Getting Started

If there’s one place you don’t want to rely on last-minute heroics, it’s security. Teams ship faster, attackers get more sophisticated, and the old “pen test right before release” approach no longer holds up.

Automated security testing builds guardrails directly into your delivery pipeline, giving you early and continuous feedback instead of last-minute surprises.

Done well, it helps you catch issues earlier, reduce production risk, and avoid turning every release into a security fire drill.

In this post, we explore automated security testing—what it is, why it matters in modern delivery pipelines, and how it helps teams catch issues earlier without slowing down releases. 

Automated Accessibility Testing: A Complete Guide for QA Teams 

Digital inclusion is essential in modern software. Over 1 billion people, about 17 percent of the global population, live with a disability, and digital barriers can limit access to key services like healthcare, jobs, and education.

Accessibility is often treated as a final QA step, which leads to delays and higher costs. Automated Accessibility Testing, especially when enhanced with AI, helps teams catch issues early and build more inclusive products.

This guide explores how to build a modern accessibility program with automation and AI, and why accessibility is about universal design, not just fixes.

Quality Engineering: From Testing to Continuous Quality

Software teams have always cared about quality. But the way they pursue it has changed dramatically.

What started as a final-stage safety net (a QA team that caught bugs before release) has evolved into something far more strategic. Today, quality engineering (QE) sits at the heart of how modern software organizations build, ship, and scale products.

This post will break down everything you need to know: what quality engineering actually is, how it differs from traditional testing, and how teams use automation, data, and modern practices to achieve continuous quality.

How to Use ChatGPT for Usability Testing

User testing helps teams ensure that their digital products meet user expectations and deliver a rich and intuitive experience. However, it’s also a very time-consuming process.

To streamline usability testing workflows, ChatGPT has emerged as a valuable assistant for generating test cases, crafting research questions, and analyzing user feedback.

In this guide, you’ll learn how to use ChatGPT for usability testing, as well as follow along with a step-by-step tutorial.


What Is Self-Healing Test Automation? A Guide

Automated tests break often. A developer renames a button, a field moves, or a locator changes, and suddenly tests fail even though the app still works.

This leads to constant maintenance instead of new test coverage. Self-healing test automation solves this by automatically adapting to UI changes.

This post explains what self-healing test automation is, how it works, why teams use it, and how to start implementing it effectively.
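To sketch the core idea without a real browser driver, the toy finder below tries a primary locator first, then falls back to alternative attributes recorded earlier. The page model and attribute names are invented for illustration:

```python
# Toy page model: each element is a dict of attributes. A real tool
# would query a live DOM through a browser driver instead.
page = [
    {"id": "submit-btn-v2", "text": "Submit", "role": "button"},
    {"id": "cancel-btn", "text": "Cancel", "role": "button"},
]

def find_element(page, locators):
    """Try locators in priority order; return the first match.

    `locators` is a list of (attribute, value) pairs captured when the
    test was recorded. If the primary locator (e.g. an id) breaks after
    a UI change, a secondary one (text, role) can still resolve it.
    """
    for attr, value in locators:
        for element in page:
            if element.get(attr) == value:
                return element, (attr, value)
    raise LookupError(f"no locator matched: {locators}")

# The recorded id "submit-btn" no longer exists, but the text fallback
# still finds the element -- the test "heals" instead of failing.
element, used = find_element(page, [("id", "submit-btn"), ("text", "Submit")])
print(used)  # ('text', 'Submit')
```

Production tools add a confidence score and usually log the healed locator so a human can confirm the substitution was correct.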

Chain-of-Thought Prompting: A Guide With Examples

You ask an AI tool a complex question like, “Which test cases should I prioritize for this release?” and it returns an answer that sounds confident but offers no reasoning you can trace.

Often the problem isn’t how you prompted the AI; it’s how the AI was asked to think. Chain-of-thought (CoT) prompting is a technique that fixes that.

This guide breaks down what CoT is, how it works, and where it shines, with real examples built for engineers and QA professionals who work with code and testing every day.
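As a taste of the technique, the difference between a plain prompt and a CoT prompt can be as small as an explicit reasoning instruction. The task and context strings below are invented for illustration:

```python
task = "Which test cases should I prioritize for this release?"
context = (
    "Changed modules: checkout, payments. "
    "Flaky suites: mobile-regression."
)

# Plain prompt: the model answers directly, with no visible reasoning.
plain_prompt = f"{context}\n\n{task}"

# Chain-of-thought prompt: ask for intermediate steps before the final
# answer, so the ranking logic can be inspected and challenged.
cot_prompt = (
    f"{context}\n\n{task}\n\n"
    "Think step by step: first list the areas affected by the changes, "
    "then rank test cases by risk, and only then give your final answer."
)
print(cot_prompt)
```

The reasoning the model emits in response to the CoT version is what makes its prioritization auditable rather than a black-box verdict.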

What Is an AI MCP Server? A Beginner’s Guide

Modern software delivery needs automation that can adapt as systems change, but many testing workflows are still fragmented and slow.

AI-powered Model Context Protocol (MCP) servers solve this by giving AI agents a standard way to connect and coordinate testing tools and data across the pipeline.

This post explains what AI MCP servers are, what they enable in testing, and best practices for adopting them.

What Is Automated Web App Testing? A Beginner’s Guide

Modern web applications are complex, handling authentication, payments, third-party integrations, and dynamic content across devices.

As applications grow, manual testing becomes impractical. Automated web app testing helps teams verify functionality across the full stack, not just the UI.

It goes beyond checking buttons, covering workflows, APIs, backend logic, and performance.

This post explains what automated web app testing is and how agentic AI is changing how teams approach it.

MCP Prompts: A Complete Introductory Guide

Agentic AI is transforming software testing by enabling systems that can plan, execute, and analyze workflows autonomously.

A 2026 survey of 500 executives at $100M+ companies shows strong adoption plans, but scaling remains a challenge due to inconsistent outputs. MCP prompts help QA and DevOps teams bring structure and repeatability to AI-driven workflows, making results more reliable and reusable.

This guide explains MCP prompts and how to apply them in testing, including test generation, defect analysis, and regression evaluation.

What Is Software Quality? A Beginner’s Guide

Software teams all want to ship high-quality products, but “quality” often means different things to different people, from bug-free code to performance, security, or user experience.

The challenge is that quality is difficult to manage without clear definitions and measurable standards. As systems grow and release cycles accelerate, relying on intuition alone can lead to defects reaching production and increased rework.

This guide explains what software quality means, how to measure it, and how modern teams maintain it at scale.

Simplify SEO with Osiris, a single source of truth

Change Impact Assessment: A Step-by-Step Guide

Every software team has seen it. A small code change gets merged, tests pass, the build is green, and everything looks fine.

Then days later, a critical workflow breaks in production with no obvious connection to the change.

Change impact assessment helps prevent these failures by identifying what a change might affect before it reaches users.

This post explains what change impact assessment is, why it matters, how to do it, and how agentic technology is improving the process.

Agentic Functional Testing: A Complete Guide

AI coding tools are accelerating development, but QA teams are struggling to keep up.

Traditional QA relies on writing and maintaining scripts across multiple frameworks, often duplicating effort across web, mobile, and native platforms. This leads to brittle tests, frequent failures, and high maintenance overhead.

In fast-moving Agile teams, this creates too much noise and slows down real defect detection.

In this post, we’ll explore why legacy QA approaches don’t scale and how agentic functional testing helps teams keep up.

What Is Agent Orchestration? A Practical Guide

Most teams struggle with AI because their tools don’t share context, leading to duplicated effort and missed signals.

Agent orchestration solves this by coordinating AI tools so they work together as a system. Gartner projects that by 2027, 80% of enterprises will use AI-augmented testing tools, up from 10% in 2022.

In this post, we’ll explain what agent orchestration is, how it works, and why it matters for software testing.

Top 10 Code Coverage Tools for Software Testing

You wrote the tests and everything passed, but the real question is whether your tests actually cover the code that matters.

Code coverage tools help answer that by showing how much of your codebase is exercised by tests, not just whether they pass. This is critical for avoiding missed edge cases that can lead to outages, security issues, or audit failures.

In this post, we break down the top code coverage tools, what they do well, where they fall short, and how to choose the right one for your stack.
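The underlying mechanism is simple to demonstrate: instrument the code so each statement records when it runs, then compare against the full set of statements. The hand-instrumented toy below is only a sketch of what tools like coverage.py do automatically:

```python
executed = set()

def classify(n):
    # Hand-instrumented version of what coverage tools do automatically:
    # each statement records itself before running.
    executed.add("check")
    if n % 2 == 0:
        executed.add("even")
        return "even"
    executed.add("odd")  # never reached by the tests below
    return "odd"

# "Tests" that all pass...
assert classify(2) == "even"
assert classify(4) == "even"

# ...yet coverage reveals the odd branch was never exercised.
all_statements = {"check", "even", "odd"}
missed = all_statements - executed
coverage = len(executed) / len(all_statements)
print(f"coverage: {coverage:.0%}, missed: {missed}")  # coverage: 67%, missed: {'odd'}
```

This is exactly the gap the post describes: the suite is green, but a whole branch is untested, and only a coverage report makes that visible.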

What Are Agentic Workflows? Everything You Should Know

Traditional automation works well for repeatable tasks in stable environments, but modern software delivery has outpaced static scripts and fixed workflows.

Agentic workflows offer a more flexible approach by focusing on goals instead of hardcoded steps. They can decide what to do next, call the right tools, and adapt when things change.

Gartner, as cited by Slack, expects agentic AI to enable a goal-driven digital workforce that works alongside humans.

In this post, we’ll explain what agentic workflows are, how they work, and how to implement them in a reliable, production-ready way.
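A toy sketch of the goal-driven loop: instead of executing a fixed script, the agent inspects the current state and picks the next tool until no work remains. The tools and the state model here are invented for illustration:

```python
# Toy tools the agent can call; in practice these would be real
# integrations (test runners, ticketing APIs, deploy hooks).
def run_tests(state):
    state["tests_run"] = True
    state["failures"] = ["checkout"]  # simulated test result
    return state

def file_defects(state):
    state["tickets"] = [f"BUG-{name}" for name in state.pop("failures")]
    return state

def pick_next_tool(state):
    # Goal-driven selection: decide what to do from the state,
    # rather than following a hardcoded sequence of steps.
    if not state.get("tests_run"):
        return run_tests
    if state.get("failures"):
        return file_defects
    return None  # goal reached

state = {}
while (tool := pick_next_tool(state)) is not None:
    state = tool(state)

print(state)  # {'tests_run': True, 'tickets': ['BUG-checkout']}
```

The selection step is where production systems substitute an LLM or planner; the loop structure (observe state, choose tool, act, repeat) stays the same.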

Using AI Driven Data Loss Protection for Insider Threats

It’s increasingly common for employees using generative AI tools to paste sensitive company data into chatbot prompts, often without realizing the risk.

This can include proprietary code, customer PII, and payment card information, sometimes even from unmanaged personal accounts outside corporate visibility.

The issue is rarely malicious intent. It’s usually convenience, like pasting a spreadsheet into ChatGPT to quickly summarize data. But it creates a new, often invisible data loss risk.

This is the problem AI-driven data loss prevention is designed to address.

Microsoft Entra ID vs Okta: Best Choice for Enterprise IT

Microsoft Entra ID and Okta are two leading identity and access management platforms, each taking a different approach to securing enterprise environments. Entra ID is tightly integrated with Microsoft’s ecosystem, while Okta is designed to work across a wide range of tools and cloud applications.

As companies scale their cloud and SaaS usage, choosing the right identity platform has become a key security and architecture decision.

In this post, we compare Microsoft Entra ID vs Okta and break down their strengths, trade-offs, and use cases.


Cross-Platform Testing Explained: A Practical Guide For 2026

Your app may run perfectly on one device but break on another, frustrating users and risking lost trust and revenue. 

In today’s world of diverse platforms, users still expect a flawless experience, regardless of the device or platform they access your app on. That’s why cross-platform testing is no longer optional; it’s critical. 

In this guide, we’ll cover what cross-platform testing entails, why it matters, common challenges, and how AI is shaping the future of cross-platform testing. 

How to Reduce Test Automation Costs: 3 Practical Ways

Software companies constantly look for ways to reduce costs, and test automation is often where budgets quietly spiral out of control.

Many teams invest heavily in automation only to find that maintaining scripts consumes more time than shipping features, with regression testing taking up 40–50% of QA effort.

The good news is that most of this cost comes from predictable patterns with practical fixes.

In this guide, we’ll cover how to reduce test automation costs, including the biggest maintenance traps, what drives inefficiency, and how to improve ROI with better tools and practices.

What Is Agentic Testing? Everything You Need to Know

We’ve often seen QA engineers spend hours rewriting scripts because a code change renamed a variable and caused tests to break.

Everything works functionally, but the tests report a failure that now needs to be investigated and fixed. This is the reality most testing teams live in, and it’s exactly the problem agentic testing is designed to solve.

In this article, we’ll take a deep dive into the world of agentic testing.

Mobile Test Automation: A Guide to Getting Started

Mobile test automation has become essential as apps grow more complex and release cycles get faster. But traditional approaches often struggle with flaky tests, high maintenance, and the challenge of keeping coverage consistent across devices, platforms, and frequent UI changes.

In this guide, we’ll explore mobile test automation, how it works, why it matters, and the key strategies teams use to build reliable, scalable testing for modern mobile applications.

Bed Management: A Complete Guide to Help You Succeed

A recent study published in JAMA Internal Medicine shows that the number of skilled nursing facility beds decreased by 2.5% between 2019 and 2024 while operating capacity shrank by 5%.

As capacity tightens, effective bed management has taken center stage for facilities looking to optimize occupancy and protect revenue.

To understand how your facility can make the most of limited space, you need to have a clear understanding of what effective bed management looks like in practice, why it matters, and how to make it happen.

Low Census: What It Is and How to Improve It

Low census is one of the most persistent operational challenges in post-acute care, directly impacting staffing, revenue, and overall facility performance. When bed occupancy drops, even slightly, it can create ripple effects across admissions, scheduling, and financial stability.

In this post, we’ll explain what low census means, why it happens, and how skilled nursing facilities can address it to improve occupancy and strengthen long-term performance.

Meme Of The Month

Straight from our internal Slack channel—because memes are fun, and so are we.

Meme with a dog

That’s All, Folks!

Thanks for catching up with us and we’ll see you next month. In the meantime, feel free to reach out if you have any questions, want to share your thoughts, or want to talk shop!

Read the whole story
alvinashcraft
16 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

The latest AI news we announced in April 2026

Here are Google’s latest AI updates from April 2026

Meet the Finalists: JetBrains x Codex Hackathon


Put a capable coding model inside a developer’s primary workspace, and the IDE stops being a place where you write code. It becomes a place where you direct an agent, watch how it reasons, manage what it pays attention to, and decide when its output is worth shipping. That was the defining theme of the inaugural JetBrains x Codex Hackathon: across roughly 40 submissions over a single weekend, teams explored what it actually means to build with AI natively inside the IDE – not bolted on top of it. The six finalists came up with some of the most compelling answers.

🥇 First Place: hyperreasoning – Aditya Mangalampalli

Most coding agents call the model once and hope for the best. As Aditya puts it: “LLMs spend a lot of time thinking in circles.” Hyperreasoning replaces the single shot with something closer to a search: the system drafts several possible approaches to a task, then a learned controller decides which to expand, which to cut, and which to verify against tests. Compiler errors and failing tests feed back into how the controller weighs its options.

Inside the IDE, a tool window renders the search live, so you can watch which paths the controller explored before settling on one. The argument the project makes is that a smaller local model wrapped in this kind of verified search loop can hold its own against much larger frontier models at meaningfully lower cost — with the IDE serving as the place where reasoning becomes visible and directable, rather than a black box that returns code.

🥈 Second Place: Scopecreep – Bhavik Sheoran, Kenneth Ross, Roman Javadyan, Joon Im

Hardware bring-up is a tool-juggling exercise: schematic viewer in one window, vendor apps for the oscilloscope and power supply in others, a terminal talking to the device, a spreadsheet collecting results. Scopecreep collapses that into a single JetBrains tool window. Hand it a circuit schematic and an agent works through testing the board – picking signals worth measuring, capturing the readings, and producing a report.

The design choice worth noticing: when the agent decides a probe needs to be placed, the session pauses and shows the engineer exactly where to put it. The engineer places the probe physically and clicks Resume. It’s the right call for real instruments on a real bench – autonomous, where a computer can be trusted, human-in-the-loop, where the work touches the physical world.

🥉 Third Place: mesh-code – Ayush Ojha, Coco Cao, Kush Ise, AL DRAM

Switch machines mid-task, and your coding agent starts over. mesh-code fixes that by giving agents shared memory of an in-progress project – what’s been tried, what’s been decided, what’s still pending – so a session that begins on one laptop can continue from another, with whichever agent happens to be available. Codex is one of the agents that can plug in.

Latent Signal – Periscope

Long agent sessions accumulate dead weight: tool outputs nobody needs anymore, dead ends, context that was useful ten turns ago and isn’t now. Periscope, built on Wes McKinney’s open-source agentsview, is a JetBrains plugin that shows what’s actually filling up an agent’s working memory turn by turn – and recommends what to do about it, whether that’s continuing, rewinding to a better branching point, compacting, forking, or handing off entirely. It works with Codex and most other coding agents, and everything stays local.

SecureLoop – Abhiram Sribhashyam, Rahul Marri, Peyton Li

Security incident response is still mostly copy-paste: stack trace into a chat window, repo context explained by hand, a fix written and committed in the hope it’s safe. SecureLoop turns that into a controlled loop inside JetBrains. When something breaks in production, the agent gathers the relevant code, the project’s security rules, and the state of its dependencies, then asks Codex for a structured diagnosis and a proposed fix. That fix runs through automated checks before any pull request opens.

The PR opens automatically. The merge does not. SecureLoop surfaces everything that informed the decision – the diff, the policy it bumped into, the test that proved the patch – inside the IDE for the developer to approve or reject. As the team put it: “Codex fully makes the PR ready for you, and it remains human-in-the-loop where you have to approve or deny.”

The team’s bigger thesis is a security-policy.md file that lives in the repo alongside README.md, spelling out a project’s specific rules for handling secrets, errors, and risky patterns. Coding agents read it before suggesting changes, so the question stops being “what’s a good fix?” and becomes “what’s an acceptable fix under this codebase’s rules?”

Pinpoint – Het Patel

Frontend feedback delivered through a chat window is unavoidably vague. “Move that element” or “change that color” leaves the agent guessing which element you actually mean. Pinpoint takes that piece of the ambiguity off the table: developers drop pins directly on a live page, attach a comment to each, and send the whole batch to the agent with precise on-page context attached. The agent now knows exactly which element you meant – even if it still has to figure out what change you want.

The project ships in two pieces: one for annotating web pages in a browser, and a desktop companion for marking up anything visible on screen – useful when the interface in question isn’t a web page.

What the finalists show

Looking across these six projects, a clear pattern emerges. Codex embedded in the IDE isn’t just a faster way to write code – it’s a reasoning layer you can watch think, a structured output engine you can direct, a participant in workflows that span hardware instruments, production alerts, shared session state, and context windows. And the IDE becomes the place where all of that comes together: visible, controllable, and version-controlled.

That’s the possibility these teams spent a weekend proving out, and it’s only the beginning.

View the full submission gallery.


Breaking the code: Multi-stage ‘code of conduct’ phishing campaign leads to AiTM token compromise


Phishing campaigns continue to grow more sophisticated, blending social engineering, delivery and hosting infrastructure, and authentication abuse to remain effective against evolving security controls. A large-scale credential theft campaign observed by Microsoft Defender Research exemplifies this trend, using code of conduct-themed lures, a multi-step attack chain, and legitimate email services to distribute fully authenticated messages from attacker-controlled domains.

The campaign targeted tens of thousands of users, primarily in the United States, and directed them through several stages of CAPTCHA and intermediate staging pages designed to reinforce legitimacy while filtering out automated defenses. The lures in this campaign used polished, enterprise-style HTML templates with structured layouts and preemptive authenticity statements, making them appear more credible than typical phishing emails and increasing their plausibility as legitimate internal communications. Because the messages contained concerning accusations and repeated time-bound action prompts, the campaign created a sense of urgency and pressure to act.  


The attack chain ultimately led to a legitimate sign-in experience that was part of an adversary‑in‑the‑middle (AiTM) phishing flow, which allowed the attackers to proxy the authentication session and capture authentication tokens that could provide immediate account access. Unlike traditional credential harvesting, AiTM attacks intercept authentication traffic in real time, bypassing non-phishing-resistant multifactor authentication (MFA).

In this blog, we’re sharing our analysis of this campaign’s lures, infrastructure, and techniques. Organizations can defend against financial fraud initiated through phishing emails by educating users about phishing lures, investing in advanced anti-phishing solutions like Microsoft Defender for Office 365 and configuring essential email security settings, and encouraging users to employ web browsers that support SmartScreen. Organizations can also enable network protection, which lets Windows use SmartScreen as a host-based web proxy.

Multi-step social engineering campaign leading to credential theft

Between April 14 and 16, 2026, the Microsoft Defender Research team observed a series of sophisticated phishing campaigns targeting more than 35,000 users across over 13,000 organizations in 26 countries, with the majority of targets located in the United States (92%). The campaign did not focus on a single vertical but instead impacted a broad range of industries, most notably Healthcare & life sciences (19%), Financial services (18%), Professional services (11%), and Technology & software (11%). Messages were distributed in multiple distinct waves between 06:51 UTC on April 14 and 03:54 UTC on April 16.

Figure 1. Timeline of campaign messages sent by hour
Figure 2. Campaign recipients by country and industry

Emails in this campaign posed as internal compliance or regulatory communications, using display names such as “Internal Regulatory COC”, “Workforce Communications”, and “Team Conduct Report”. Subject lines included “Internal case log issued under conduct policy” and “Reminder: employer opened a non-compliance case log”.

Message bodies claimed that a “code of conduct review” had been initiated, referenced organization-specific names embedded within the text, and instructed recipients to “open the personalized attachment” to review case materials. At the top of each message, a notice stated that the message had been “issued through an authorized internal channel” and that links and attachments had been “reviewed and approved for secure access”, reinforcing the email’s purported legitimacy. To further support the confidentiality of the supposed review, the end of each message contained a green banner stating that the contents had been encrypted using Paubox, a legitimate service associated with HIPAA-compliant communications.

Figure 3. Sample phishing email

Analysis of the sending infrastructure indicated that the campaign emails were sent using a legitimate email delivery service, likely originating from a cloud-hosted Windows virtual machine. The messages were sent from multiple sender addresses using domains that are likely attacker-controlled.

Each campaign email included a PDF attachment with filenames such as Awareness Case Log File – Tuesday 14th, April 2026.pdf and Disciplinary Action – Employee Device Handling Case.pdf. The attachment provided additional context about the supposed conduct review, including a summary of the review process and instructions for accessing supporting documentation. Recipients were directed to click a “Review Case Materials” link within the PDF, which initiated the credential harvesting flow.

Screenshot of PDF attachment used in the campaign
Figure 4. PDF attachment

When clicked, users were initially directed to one of two attacker-controlled domains (for example, acceptable-use-policy-calendly[.]de or compliance-protectionoutlook[.]de). These landing pages displayed a Cloudflare CAPTCHA, presented as a mechanism to validate that the user was coming “from a valid session”. This CAPTCHA likely served as a gating mechanism to impede automated analysis and sandbox detonation. 

Screenshot of captcha challenge.
Figure 5. CAPTCHA challenge

After completing the CAPTCHA, users were redirected to an intermediate site designed to prepare them for the final stage of the attack. This page informed users that the requested documentation was encrypted and required account authentication. While this stage of the attack has several hallmarks of device code phishing, we were only able to confirm the AiTM portion of the attack chain.

Screenshot of intermediate site asking users to click review & sign button
Figure 6. Intermediate site asking users to click “Review & Sign”

After clicking the provided “Review & Sign” button, users were presented with a sign-in prompt requesting their email address.

Screenshot of prompt directing users to enter email address
Figure 7. Prompt directing users to enter their email address

After submission, users were required to complete a second CAPTCHA involving image selection.

Screenshot of second captcha challenge
Figure 8. Second CAPTCHA challenge

Once these steps were completed, users were shown a message indicating that verification was successful and that their “case” was being prepared.

Screenshot of message telling users that verification completed successfully
Figure 9. Message telling users that “Verification completed successfully”

Following these steps, users were redirected to a third site hosting the final stage of the attack. Analysis of the underlying code indicates that the final destination varied depending on whether the user accessed the workflow from a mobile device or a desktop system.

Screenshot of code used to redirect users based on platform, whether mobile or desktop
Figure 10. Code used to redirect users based on platform

On the final page, users were informed that all materials related to their code of conduct review had been “securely logged”, “time-stamped”, and “maintained within the organization’s centralized compliance tracking system”. They were then prompted to schedule a time to discuss the case, which required signing in to their account.

Screenshot of final page instructing users to sign in
Figure 11. Final page instructing users to sign in

Selecting the “Sign in with Microsoft” option redirected users to a Microsoft authentication page, initiating an AiTM session hijacking flow designed to capture authentication tokens and compromise user accounts.

Mitigation and protection guidance

Microsoft recommends the following mitigations to reduce the impact of this threat. Check the recommendations card for the deployment status of monitored mitigations.

  • Review the recommended settings for Exchange Online Protection and Microsoft Defender for Office 365 to ensure your organization has established essential defenses and knows how to monitor and respond to threat activity.
  • Invest in user awareness training and phishing simulations. Attack simulation training in Microsoft Defender for Office 365, which also includes simulating phishing messages in Microsoft Teams, is one approach to running realistic attack scenarios in your organization.
  • Enable Zero-hour auto purge (ZAP) in Defender for Office 365 to quarantine sent mail in response to newly acquired threat intelligence and retroactively neutralize malicious phishing, spam, or malware messages that have already been delivered to mailboxes.
  • Responders could also manually check for and purge unwanted emails containing URLs and/or Subject fields that are similar, but not identical, to those of known bad messages. Investigate malicious email that was delivered in Microsoft 365 and use Threat Explorer to find and delete phishing emails.
  • Turn on Safe Links and Safe Attachments in Microsoft Defender for Office 365.
  • Enable network protection in Microsoft Defender for Endpoint.
  • Encourage users to use Microsoft Edge and other web browsers that support Microsoft Defender SmartScreen, which identifies and blocks malicious websites, including phishing sites, scam sites, and sites that host malware.
  • Enable password-less authentication methods (for example, Windows Hello, FIDO keys, or Microsoft Authenticator) for accounts that support password-less. For accounts that still require passwords, use authenticator apps like Microsoft Authenticator for multifactor authentication (MFA). Refer to this article for the different authentication methods and features.
  • Configure automatic attack disruption in Microsoft Defender XDR. Automatic attack disruption is designed to contain attacks in progress, limit the impact on an organization’s assets, and provide more time for security teams to remediate the attack fully.

Microsoft Defender detections

Microsoft Defender customers can refer to the list of applicable detections below. Microsoft Defender coordinates detection, prevention, investigation, and response across endpoints, identities, email, and apps to provide integrated protection against attacks like the threat discussed in this blog.

Tactic: Initial access
Observed activity: Phishing emails
Microsoft Defender coverage: Microsoft Defender for Office 365
– A potentially malicious URL click was detected
– A user clicked through to a potentially malicious URL
– Suspicious email sending patterns detected
– Email messages containing malicious URL removed after delivery
– Email messages removed after delivery
– Email reported by user as malware or phish

Tactic: Persistence
Observed activity: Threat actors sign in with stolen valid identities
Microsoft Defender coverage: Microsoft Entra ID Protection
– Anomalous Token
– Unfamiliar sign-in properties
– Unfamiliar sign-in properties for session cookies

Microsoft Defender for Cloud Apps
– Impossible travel activity

Microsoft Security Copilot

Microsoft Security Copilot is embedded in Microsoft Defender and provides security teams with AI-powered capabilities to summarize incidents, analyze files and scripts, summarize identities, use guided responses, and generate device summaries, hunting queries, and incident reports.

Customers can also deploy AI agents, such as Microsoft Security Copilot agents, to perform security tasks efficiently.

Security Copilot is also available as a standalone experience where customers can perform specific security-related tasks, such as incident investigation, user analysis, and vulnerability impact assessment. In addition, Security Copilot offers developer scenarios that allow customers to build, test, publish, and integrate AI agents and plugins to meet unique security needs.

Threat intelligence reports

Microsoft Defender XDR customers can use the following threat analytics reports in the Defender portal (requires license for at least one Defender XDR product) to get the most up-to-date information about the threat actor, malicious activity, and techniques discussed in this blog. These reports provide the intelligence, protection information, and recommended actions to prevent, mitigate, or respond to associated threats found in customer environments.

Microsoft Security Copilot customers can also use the Microsoft Security Copilot integration in Microsoft Defender Threat Intelligence, either in the Security Copilot standalone portal or in the embedded experience in the Microsoft Defender portal to get more information about this threat actor.

Hunting queries

Microsoft Defender XDR customers can run the following advanced hunting queries to find related activity in their networks:

Campaign emails by sender address

The following query identifies emails associated with this campaign using a message’s sending email address.

EmailEvents
| where SenderMailFromAddress in ("cocpostmaster@cocinternal.com", "nationaladmin@gadellinet.com", "nationalintegrity@harteprn.com", "m365premiumcommunications@cocinternal.com", "documentviewer@na.businesshellosign.de")
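
A complementary approach (our sketch, not part of the published guidance) is to hunt by the campaign's PDF attachment naming pattern, using the filenames listed under Indicators of compromise. Validate the field names against the advanced hunting schema in your tenant.

```kusto
// Sketch: find campaign emails by the PDF attachment naming pattern.
// Field names follow the Microsoft Defender XDR advanced hunting schema.
EmailAttachmentInfo
| where FileName startswith "Awareness Case Log File"
| join kind=inner EmailEvents on NetworkMessageId
| project Timestamp, SenderMailFromAddress, RecipientEmailAddress, Subject, FileName
```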

Indicators of compromise

Indicator | Type | Description | First seen | Last seen
compliance-protectionoutlook[.]de | Domain | Domain hosting malicious campaign content | 2026-04-14 | 2026-04-16
acceptable-use-policy-calendly[.]de | Domain | Domain hosting malicious campaign content | 2026-04-14 | 2026-04-16
cocinternal[.]com | Domain | Domain hosting sender email address | 2026-04-14 | 2026-04-16
gadellinet[.]com | Domain | Domain hosting sender email address | 2026-04-14 | 2026-04-16
harteprn[.]com | Domain | Domain hosting sender email address | 2026-04-14 | 2026-04-16
cocpostmaster[@]cocinternal.com | Email address | Email address used to send campaign emails | 2026-04-14 | 2026-04-16
nationaladmin[@]gadellinet.com | Email address | Email address used to send campaign emails | 2026-04-14 | 2026-04-16
nationalintegrity[@]harteprn.com | Email address | Email address used to send campaign emails | 2026-04-14 | 2026-04-16
m365premiumcommunications[@]cocinternal.com | Email address | Email address used to send campaign emails | 2026-04-14 | 2026-04-16
documentviewer[@]na.businesshellosign.de | Email address | Email address used to send campaign emails | 2026-04-14 | 2026-04-16
Awareness Case Log File – Monday 13th, April 2026.pdf | Filename | Name of PDF attachment containing phishing link | 2026-04-14 | 2026-04-14
Awareness Case Log File – Tuesday 14th, April 2026.pdf | Filename | Name of PDF attachment containing phishing link | 2026-04-15 | 2026-04-15
Awareness Case Log File – Wednesday 15th, April 2026.pdf | Filename | Name of PDF attachment containing phishing link | 2026-04-16 | 2026-04-16
5DB1ECBBB2C90C51D81BDA138D4300B90EA5EB2885CCE1BD921D692214AECBC6 | SHA-256 | File hash of campaign PDF attachment | 2026-04-14 | 2026-04-16
B5A3346082AC566B4494E6175F1CD9873B64ABE6C902DB49BD4E8088876C9EAD | SHA-256 | File hash of campaign PDF attachment | 2026-04-14 | 2026-04-16
11420D6D693BF8B19195E6B98FEDD03B9BCBC770B6988BC64CB788BFABE1A49D | SHA-256 | File hash of campaign PDF attachment | 2026-04-14 | 2026-04-16

Learn more

For the latest security research from the Microsoft Threat Intelligence community, check out the Microsoft Threat Intelligence Blog.

To get notified about new publications and to join discussions on social media, follow us on LinkedIn, X (formerly Twitter), and Bluesky.

To hear stories and insights from the Microsoft Threat Intelligence community about the ever-evolving threat landscape, listen to the Microsoft Threat Intelligence podcast.

The post Breaking the code: Multi-stage ‘code of conduct’ phishing campaign leads to AiTM token compromise appeared first on Microsoft Security Blog.

Register now for OpenClaw: After Hours @ GitHub

OpenClaw, one of the fastest-growing open source projects, has already picked up over 350,000 stars and an early community of builders exploring what agentic systems can actually do in practice.

That’s why, on June 3, 2026, we are hosting OpenClaw: After Hours at GitHub HQ in San Francisco. The event will take place during Microsoft Build 2026.

This evening is a chance to bring the OpenClaw community together into the same room.

We’ll kick things off in the early evening with a fireside conversation featuring Peter Steinberger, the ClawFather and creator of OpenClaw, followed by a panel with OpenClaw maintainers and ecosystem builders sharing what’s working—and what’s not—when shipping real agentic systems.

Later in the evening, we’ll move into a series of fast-paced lightning talks and close things out with a relaxed happy hour to connect with other builders.

If you have been following the project or building with it yourself, this is a good chance to meet others, trade notes, and get your claws into what people are actually shipping.

👉 For the full agenda and speaker lineup, please see the registration page.

📍 GitHub HQ, 275 Brannan St., San Francisco
🗓 June 3, 5:30 p.m. – 9 p.m.
📺 Livestream: twitch.tv/github

Drinks and snacks will be provided. There will be a lot here to chew on. No shellfish behavior please. And bring your sharp ideas!

Spots are limited, so register early and come ready to share what you are working on.

‼️ Please note: Submitting a registration does not guarantee attendance. We’ll follow up to confirm successful registrations.

What is OpenClaw?

OpenClaw is an open source framework for building and running agentic systems, focused on giving developers real control over how agents execute tasks in the wild. It provides the core pieces for orchestrating tools, managing state, and handling long running workflows, so you can move beyond prompt demos and ship systems that actually do work. It’s also probably convinced more than a few people to buy a Mac Mini just to run “one small experiment” that somehow turned into a permanent setup.

Hear more about OpenClaw from the creator himself, Peter Steinberger: 

The post Register now for OpenClaw: After Hours @ GitHub appeared first on The GitHub Blog.

We Gave Agents IDE-Native Search Tools. They Got Faster and Cheaper.

We ran the same coding tasks with and without prebundled tooling, across multiple models and languages. Here’s what changed.

Eval-driven development

IDE-native search reduced latency, cost, and budget overruns.

The comparison below uses paired task-level deltas. Aggregate medians and totals are shown for orientation. Budget overruns are tasks that exceeded the USD 0.50 per-task cap.

  • Median latency reduced 8.33% (83.11s → 79.03s)
  • P95 latency reduced 16.44% (268.71s → 213.17s)
  • Total cost reduced 5.60% (USD 44.17 → USD 41.67)
  • Budget overruns reduced 33.28% (6.67% → 4.44%)

Why We Built This

When coding agents search code, they default to shell tools. grep and find work, but they’re blind to project structure, symbol boundaries, and language semantics. The agent burns tokens sifting through noisy output and making follow-up calls to narrow things down.

So we tried something obvious: what if the agent could use the IDE’s own search instead?

We built a prebundled skill that pairs a search prompt with a unified MCP tool. One tool, four modes: file search, text search, regex, and symbol lookup. A universal router dispatches calls to the right backend.
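
As a rough illustration (not the actual implementation; all names are hypothetical), a unified tool with a mode router might look like this, with each IDE backend stubbed out so the routing logic itself is runnable:

```python
# Hypothetical sketch of a unified search tool with a universal router.
# The real tool dispatches to IDE indices and the project model; here
# each backend is a stub that tags the query with its mode.

from typing import Callable, Dict, List

def file_search(query: str) -> List[str]:
    # Would query the IDE's file-name index.
    return [f"file:{query}"]

def text_search(query: str) -> List[str]:
    # Would query the IDE's full-text index.
    return [f"text:{query}"]

def regex_search(query: str) -> List[str]:
    # Would run an index-backed regex scan.
    return [f"regex:{query}"]

def symbol_search(query: str) -> List[str]:
    # Would look up symbols via the IDE's AST/project model.
    return [f"symbol:{query}"]

BACKENDS: Dict[str, Callable[[str], List[str]]] = {
    "files": file_search,
    "text": text_search,
    "regex": regex_search,
    "symbols": symbol_search,
}

def unified_search(mode: str, query: str) -> List[str]:
    """Single MCP tool entry point: route one call to the right backend."""
    try:
        return BACKENDS[mode](query)
    except KeyError:
        raise ValueError(f"unknown mode {mode!r}, expected one of {sorted(BACKENDS)}")
```

The point of the single entry point is that the agent sees one tool with a `mode` parameter instead of four separate tools, which keeps the tool list short and the routing decision on our side.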

MCP Tools

Functions the agent calls via an MCP server during task execution. IDE-native tools can tap into indices, ASTs, and project models that shell tools cannot see.

Skills

Packaged agent behaviors: a prompt plus orchestration logic. A skill can work on its own, use tools, or ship bundled with the tools it needs.

Nothing ships by default until the eval says it should. We tested four different configurations of this tooling before picking one.

Methodology

The eval pipeline spins up an MCP server alongside the IDE so the agent has access to the configured tools and skills. We run identical coding tasks with and without tooling, then compare with paired delta analysis.

We track four things: quality, latency, cost, and budget discipline. Quality asks whether all tests passed. Latency tracks median and P95 task time. Cost converts token consumption into dollars. Budget discipline tracks how often a single task exceeds the USD 0.50 budget cap.

We report improvement deltas only when they pass our significance threshold: p < 0.05, paired test with 95% confidence intervals. Metrics without a significant change are either omitted from the charts or called out explicitly. We tried four configuration variants, selected the one with the best latency and cost tradeoff, then re-ran it on different models and languages to check that the results held.

Eval frame

Same tasks, same grading, one controlled difference.

Quality: all-tests-passed rate, checked before performance claims.
Latency: median and P95 task duration, compared with paired deltas.
Cost: token use converted to dollars across the task set.
Budget discipline: share of tasks exceeding the USD 0.50 single-task cap.
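
The paired comparison above can be sketched in a few lines (our own illustration with made-up numbers, not the eval pipeline itself; the real pipeline also applies a significance test before reporting a delta):

```python
# Illustrative paired-delta analysis over per-task latencies (seconds)
# and per-task costs (USD). All numbers here are invented.
from statistics import median

BUDGET_CAP_USD = 0.50

def paired_deltas(baseline, with_tooling):
    """Per-task relative change; negative means the tooling run was faster."""
    return [(t - b) / b for b, t in zip(baseline, with_tooling)]

def overrun_rate(costs):
    """Share of tasks whose cost exceeded the per-task budget cap."""
    return sum(c > BUDGET_CAP_USD for c in costs) / len(costs)

baseline_latency = [80.0, 120.0, 95.0, 60.0]
tooling_latency  = [72.0, 110.0, 90.0, 57.0]

deltas = paired_deltas(baseline_latency, tooling_latency)
print(f"median paired delta: {median(deltas):+.1%}")
print(f"overrun rate: {overrun_rate([0.30, 0.55, 0.20, 0.45]):.0%}")
```

Pairing matters here: comparing each task against itself cancels out per-task difficulty, which is why the reported deltas can differ from what you would get by comparing the aggregate medians directly.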

Results

The selected configuration was a prebundled search skill plus a unified IDE-native tool and universal router. Compared with the no-tooling baseline, it reduced latency and cost without producing a statistically significant quality change.

Baseline vs. tooling

Absolute metrics moved in the right direction.

Median latency: baseline 83.11s → with tooling 79.03s
P95 latency: baseline 268.71s → with tooling 213.17s
Total cost: baseline USD 44.17 → with tooling USD 41.67
Budget overruns: baseline 6.67% → with tooling 4.44%

Relative improvements (paired deltas): budget overruns 33.28%, P95 latency 16.44%, median latency 8.33%, total cost 5.60%.

No statistically significant change in quality. All shown deltas passed the significance threshold.

Trace snapshots

The difference is visible in the agent’s path through the project.

These are shortened traces from cases that improved in both time and cost. The baseline spends more steps discovering context; the prebundled setup gets to the relevant files faster.

Service comments and replies

Prompt: Update service and controller layers for comments and replies.

Before (no prebundled IDE search), 472s:
agent> list files -> search x2 -> list files x2
agent> jar inspect x5 -> javap -> jar inspect -> javap x5
agent> curl download -> decompile -> search -> find files x2
agent> read 9 files -> edit file x8 -> respond

After (prebundled skill and unified search), 127s:
agent> read SKILL.md -> search x3 -> read 5 files
agent> read FeatureController.java -> read 4 files
agent> edit file x2 -> respond

Jackson key deserializer

Prompt: Preserve detailed error messages from a custom key deserializer.

Before (broad code walk), 150s:
agent> list files -> search x2 -> read README.md
agent> search x5 -> read DeserializationContext.java
agent> search x4 -> read StdDeserializer.java
agent> search -> read DeserializerCache.java
agent> read MapEntryDeserializer.java -> read JsonMappingException.java
agent> edit file -> respond

After (targeted search), 34s:
agent> read SKILL.md -> search x3
agent> read MapDeserializer.java
agent> read StdKeyDeserializer.java
agent> read DeserializationContext.java
agent> edit file -> respond

Configuration Explorer

We tested four tool configurations before choosing the final shape. Lower latency and lower total cost are better, so the lower-left corner of the plot is the target.

Configuration search

The selected option had the best latency while preserving cost reduction.

Scatter plot of median latency (78s to 84s) against total cost (USD 39.50 to 45.00) for the Baseline, 4 Search Tools, Unified Search Tool, 4 Tools + Router, and Unified Tool + Router configurations.

Cross-Model Validation

We re-ran the experiment with GPT 5.4 on Java and Kotlin codebases. The pattern holds: latency and cost both drop. Kotlin saw the biggest cost improvement, with total cost falling 13.48%.

Cross-model check

The effect held beyond the original run.

Reductions by model and language:

Codex 5.2: median latency 8.33%, total cost 5.60%, P95 latency 16.44%
GPT 5.4, Java: median latency 3.75%, total cost 4.07%, P95 latency 13.00%
GPT 5.4, Kotlin: median latency 6.92%, total cost 13.48%, P95 latency not significant

Missing bars mean that metric was not statistically significant for that model and language.

How Models Adopt Tooling

Codex sends 91% of its search calls through the new IDE-native tool. Claude is a different story: Opus uses it for about half its searches, and Haiku only 28%, preferring grep and find instead.

This makes sense. Claude already has strong built-in code search, so it leans on what it knows. Codex doesn’t, so it grabs the better tool when one is available. The takeaway: prebundled tooling fills gaps. Where the model already has good search, it adds less. Where search is weak, it makes a real difference.

Tool adoption

Models do not use new tools at the same rate.

Codex: 91% IDE Search, 8% grep, 1% find
Claude Opus: 53% IDE Search, 28% grep, 19% find
Claude Haiku: 28% IDE Search, 33% grep, 39% find

What’s Next

The eval pipeline works. Now we’re using it.

We’re running the same experiment on smaller models next. Our hunch is that they’ll benefit even more, since they have less built-in search capability to fall back on.

The current results are strongest on Java and Kotlin. We’re expanding to Python, .NET, and TypeScript with bigger sample sizes.

Meanwhile, the winning configuration is being prepared for the integrated IntelliJ IDEA MCP Server, so agent sessions can use IDE-native tooling when the server is enabled.

The next step is to turn this feature on by default in upcoming AI Assistant plugin updates.

Want to try it before the default rollout?

  1. Set these registry keys to true: llm.chat.agent.codex.mcp.idea, llm.chat.agent.skills.settings.enabled, and llm.agents.contrib.bundled.skills.sync.enabled.
  2. In AI Assistant, choose Codex for the best results.
  3. Ask the agent to find something across the current project.

Measure first, ship second, keep measuring after. That's the whole approach.