Microsoft is removing its Lens scanner app from iOS and Android in the coming months. Microsoft Lens, or Office Lens as it's known to most, will no longer be supported as of February 9th, and the app won't be functional after March 9th.
The portable scanner features of Lens are available in OneDrive instead, making a dedicated app redundant for Microsoft. You'll still be able to capture pictures of whiteboards, documents, and receipts, then save and edit them digitally in OneDrive. Pictures can then be converted into Word or PDF documents and properly cropped and rotated.
Microsoft first launched its Office Lens app for iOS and Android in 2015, …
Microsoft Lens, once known as Office Lens, has been hanging on by a thread for some time now. The original plan had been to kill off the document scanning app last year, but it was granted a stay of execution. Now, just as we ease into 2026, Microsoft has announced a revised schedule for the retirement of the app. For fans of Microsoft Lens, there are mere weeks left until it vanishes. There are many reasons for Microsoft taking the decision to kill off Lens, not least of which is the fact that there are now so…
It was Monday morning again. Meetings filled the calendar. Ideas flowed quickly. Everyone agreed on what needed to be done. You took notes. You asked questions. You left the room feeling confident.
Later that day, you opened a blank document titled Functional Design Document. Suddenly, things no longer felt clear.
The decisions no longer felt solid. Doubts started to appear:
A requirement was mentioned but not written clearly.
An edge case came up, but you were not sure where it belonged.
A constraint sounded important, yet you could not find it in your notes.
You stared at the screen longer than expected. The blank page felt heavier with every passing minute. This document mattered more than it seemed. Teams would rely on it. Decisions would depend on it.
Something felt off. You sensed that something was missing and would cause problems later, but you didn't know what it was. This is where clarity is either created… or lost.
As a Solution Engineer at Mazik Global, working closely with functional consultants, I have seen this challenge many times. A simple question came to mind:
Could AI make this easier?
That question led us to experiment with Microsoft 365 Copilot at Mazik Global. What happened next changed everything about how we create design documentation.
What is Microsoft 365 Copilot?
Microsoft 365 Copilot is a conversational, AI-powered assistant that helps boost productivity and streamline workflows by offering contextual assistance, automating routine tasks, and analyzing data.
It acts like an intelligent partner, understanding context and suggesting what comes next, which makes complex tasks faster and more efficient across organizations.
We didn't just ask Copilot to "write a Functional Design Document (FDD)." Instead, we broke the process into phases, just like we do in real projects, and explored how Copilot could assist at each step.
Workflow We Discovered:
Step 1: Brainstorming with Copilot Idea Coach Agent
The hardest part of any FDD is the beginning. Idea Coach helped us sketch a high-level structure in minutes. Instead of staring at a blank page, we started with prompts like:
"Help me brainstorm an AI-based appointment cancellation chatbot for an ABC (e.g., patient) portal."
(M365 Copilot Idea Coach Agent)
Within 5–10 minutes, we had a clear direction and workflow outline.
Step 2: Deep Research with Copilot Researcher Agent
The next phase involved gathering best practices, comparing different approaches, and validating ideas. Copilot Researcher provided detailed, up-to-date information with supporting citations and included advanced techniques relevant to the topic. Although the responses were comprehensive and lengthy, summarizing them was more efficient than starting the research process manually.
The prompt we gave:
"Give me insights on Voice Interaction Integration for Appointment Cancellation."
(M365 Copilot Researcher Agent)
Step 3: Logic and Data Analysis with Copilot Analyst Agent
For data-driven features, we used Analyst to generate conditional logic and analyze trends.
Need an if/else block for appointment validation? Analyst had it ready.
Need quick insights from a graph? Analyst interpreted it in seconds.
The prompt we gave:
"Write me a pseudo code, if-else conditional statement/logic for how an appointment cancellation AI chatbot can work."
(M365 Copilot Analyst Agent)
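To give a flavor of the output, here is a hypothetical sketch (written by us for illustration, not Copilot's verbatim response) of the kind of conditional logic we were asking for; the data class, field names, and the 24-hour rule are invented for the example:

```kotlin
// Hypothetical sketch of appointment-cancellation logic; the fields and the
// 24-hour cutoff are invented for illustration, not taken from a real system.
data class Appointment(val id: String, val startsInHours: Long, val cancelled: Boolean)

fun cancellationReply(appointment: Appointment?): String =
    if (appointment == null) {
        "I couldn't find that appointment. Could you confirm the appointment ID?"
    } else if (appointment.cancelled) {
        "That appointment has already been cancelled."
    } else if (appointment.startsInHours < 24) {
        "This appointment starts in less than 24 hours, so please call the clinic to cancel."
    } else {
        "Appointment ${appointment.id} has been cancelled. You'll receive a confirmation shortly."
    }
```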
Step 4: Drafting and Structuring with Copilot Notebook and Copilot Word
Once we had the pieces, Notebook helped organize related documents and maintain context.
The prompt we gave:
"Can you help structure this better? Voice Interaction Integration for Appointment Cancellation."
(M365 Copilot Notebook)
Copilot Word then stepped in to reword, restructure, and format the FDD, without endless copy-pasting.
(M365 Copilot in MS Word)
The result? A document that looked polished and professional, faster than ever.
What Changed for Us
Here's what stood out from our experiment:
Speed: Idea Coach confirmed ideas in minutes, not hours.
Depth: Researcher provided detailed, valid insights with references.
Efficiency: Copilot Word handled formatting and restructuring effortlessly.
Collaboration: Notebook kept everything connected for easy querying.
From Internal Experiment to Industry Insight
Our journey began with a simple question:
"Could AI make documentation easier?"
What started as an internal experiment quickly revealed something bigger: AI isn't just a tool; it's a catalyst for transforming how teams work. By testing Microsoft 365 Copilot in real-world scenarios, we uncovered practical ways to reduce friction, improve clarity, and accelerate delivery. These insights go beyond one organization; they point to a future where documentation is no longer a bottleneck but a strategic advantage.
Journey as an AI Frontier Firm
At Mazik Global, we are committed to establishing ourselves as an AI Frontier Firm, where every team member, not just developers, feels empowered to work confidently with AI.
AI isn't just about coding faster. It's about transforming how we collaborate, capture knowledge, and deliver impact.
Sometimes, the most powerful innovations begin with a simple question:
"What if AI could turn complexity into clarity?"
The Future of Documentation
This experiment showed us something powerful:
"AI isnāt here to replace expertise; itās here to amplify it."
With tools like Microsoft 365 Copilot, documentation becomes less about grunt work and more about creativity and collaboration.
Next time you face a blank page, imagine starting with clarity instead of chaos. That's the promise of AI-powered documentation.
Conclusion
Documentation has always been a critical part of delivery, but it doesn't have to be the most time-consuming. Our experiment with Microsoft 365 Copilot showed us that AI can turn complexity into clarity, helping teams move faster without sacrificing quality.
This isn't just about writing documents; it's about rethinking how we work. When AI takes care of the repetitive tasks, teams can focus on what truly matters: innovation, collaboration, and delivering impact.
The future of documentation isn't manual. It's intelligent, adaptive, and collaborative. And it starts with a simple question.
We are kicking off 2026 with a major set of updates designed to streamline how you build, test, and deploy AI agents. This month, we've focused on aligning with the latest GitHub Copilot standards, introducing powerful new debugging tools, and enhancing our support for enterprise-grade models via Microsoft Foundry.
From Copilot Instructions to Agent Skills
The biggest architectural shift in v0.28.1, following the latest VS Code Copilot standards, is the transition from Copilot Instructions to Agent Skills. This transition equips GitHub Copilot with specialized skills for developing AI agents using Microsoft Foundry and the Agent Framework in a cost-efficient way.
In AI Toolkit, we have migrated our Copilot Tools from the Custom Instructions to Agent Skills. This change allows for a more capable integration within GitHub Copilot Chat.
Enhanced AIAgentExpert: Our custom agent now has a deeper understanding of workflow code generation and evaluation planning/execution.
Automatic Migration: When you upgrade to v0.28.1, the toolkit will automatically clean up your old instructions to ensure a seamless transition to the new skills-based framework.
Major Enhancements to Agent Development
Our v0.28.0 milestone release brought significant improvements to how agents are authored and authenticated.
Anthropic & Entra Auth Support
We've expanded the Agent Builder and Playground to support Anthropic models using Entra Auth types. This provides enterprise developers with a more secure way to leverage Claude models within the Agent Framework while maintaining strict authentication standards.
Foundry-First Development
We are prioritizing the Microsoft Foundry ecosystem to provide a more robust development experience:
Foundry v2: Code generation for agents now defaults to Foundry v2.
Eval Tool: You can now generate evaluation code directly within the toolkit to create and run evaluations in Microsoft Foundry.
Model Catalog: We've optimized the Model Catalog to prioritize Foundry models and improved general loading performance.
Performance and Local Models
For developers building on Windows, we continue to optimize the local model experience:
Profiling for Windows ML: Version 0.28.0 introduces profiling features for Windows ML-based local models, allowing you to monitor performance and resource utilization directly within VS Code.
Platform Optimization: To keep the interface clean, we've removed the Windows AI API tab from the Model Catalog when running on Linux and macOS platforms.
Squashing Bugs & Polishing the Experience
Codespaces Fix: Resolved a crash occurring when selecting images in the Playground while using GitHub Codespaces.
Resource Management: Fixed a delay where newly added models wouldn't immediately appear in the "My Resources" view.
Claude Compatibility: Fixed an issue where non-empty content was required for Claude models when used via the AI Toolkit in GitHub Copilot.
Getting Started
Ready to experience the future of AI development? Install or update the AI Toolkit extension in VS Code to get started.
We'd love to hear from you! Whether it's a feature request, bug report, or feedback on your experience, join the conversation and contribute directly on our GitHub repository.
In the previous installment, we saw how to set up tracing, which brings us to two new questions: What should we experiment with based on the information this tool provides? And what parts of our agent could we improve using its observations?
The first idea we had was to experiment with sub-agents, or more specifically, a find sub-agent. This will give us a chance to have a look at how Koog makes it easier to implement common patterns like sub-agents. Our hypothesis is that a find sub-agent might reduce overall cost while maintaining, or even improving, performance.
Why would we think that? Well, the main driver of cost is context growth. Each LLM request contains the full context from start to finish, which means each subsequent request is more expensive (at least in terms of input tokens) than the previous one. If we could limit context growth, especially early in the agent's run, we might significantly reduce cost. An unnecessarily large context could also distract the agent from its core task. Therefore, by narrowing the context, we might even see a performance improvement, though that's harder to predict.
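As a rough back-of-the-envelope model (our own illustration, not data from the benchmark): if the first request carries about $c_0$ input tokens and each subsequent turn appends roughly $\Delta$ tokens of tool output and messages, the cumulative input tokens over $n$ turns are

$$\sum_{i=0}^{n-1}\left(c_0 + i\,\Delta\right) \;=\; n\,c_0 + \frac{n(n-1)}{2}\,\Delta,$$

which grows quadratically with the number of turns. Anything that keeps $\Delta$ small early on, such as keeping exploratory file reads out of the main context, attacks that quadratic term directly.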
The find functionality is particularly suited for removal from the long-term context. When searching for something, you typically open many files that don’t contain your target. Remembering those dead ends isn’t useful. Remembering what you actually found is. You could think of this as a natural way of compressing the agent’s history (we’ll look at actual compression in a later article).
This task is also a good candidate for a sub-agent because it's relatively simple. That simplicity means we could also make use of the sub-agent's ability to use a different LLM model. In this case, a faster and cheaper one. This offers flexibility that regular compression doesn't.
Of course, we could have built a traditional procedural tool to do this. In fact, we did build one called RegexSearchTool, but for the purposes of this experiment, we put it inside the find agent rather than directly in the main agent. This approach provides us with flexibility in terms of model choice while also incorporating an extra layer of intelligence.
The find agent
To be able to have a sub-agent pattern, we first need a second agent. We’ve already covered agent creation in depth in Part 1 of the series, so we won’t spend much time on this now. However, a few details are still worth noting.
First, a minor point: We're using GPT-4.1 Mini for this sub-agent because its task is much simpler and doesn't require a model as capable as the one used by the main agent.
Second, it's useful to look at which tools this agent can access. Like the main agent, it has access to the ListDirectoryTool and ReadFileTool, but not the EditFileTool or ExecuteShellCommandTool. We've also given it access to the new procedural search tool we mentioned above, RegexSearchTool, which allows us to search a comprehensive range of files inside a folder and its subfolders using a regex pattern.
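As a minimal sketch (not the actual implementation, which is linked below), the find agent could be assembled along these lines. The three tool classes are the project-specific ones described above, and the constructor parameter names and model constant follow Koog's getting-started examples but may differ between Koog versions:

```kotlin
// Illustrative sketch only: parameter names and the model constant follow recent Koog
// examples and may differ in your version; ListDirectoryTool, ReadFileTool, and
// RegexSearchTool are the project-specific tools described in this series.
import ai.koog.agents.core.agent.AIAgent
import ai.koog.agents.core.tools.ToolRegistry
import ai.koog.prompt.executor.clients.openai.OpenAIModels
import ai.koog.prompt.executor.llms.all.simpleOpenAIExecutor

fun createFindAgent(apiKey: String) = AIAgent(
    executor = simpleOpenAIExecutor(apiKey),
    // A faster, cheaper model is enough for the narrow "find" task.
    llmModel = OpenAIModels.CostOptimized.GPT4_1Mini,
    systemPrompt = """
        You search a codebase for code elements that match a detailed query.
        Report only the relevant file paths, snippets, and a short explanation.
    """.trimIndent(),
    // Read-only exploration tools plus the regex search tool; no edit or shell tools.
    toolRegistry = ToolRegistry {
        tool(ListDirectoryTool)
        tool(ReadFileTool)
        tool(RegexSearchTool)
    }
)
```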
For more detailed information, check out the full implementation here.
Building a find sub-agent
First things first: what is a sub-agent? A sub-agent is really quite simple; it is an agent that is being controlled by another agent. In this specific case, we are working with the agent-as-a-tool sub-agent pattern, where the sub-agent is running inside a tool that is provided to the main agent.
Creating a sub-agent turns out to be straightforward. We know a tool is essentially a function paired with descriptors that the agent can read to understand when and how to call it. We could simply define a tool whose .execute() function calls our sub-agent. But Koog provides tools to remove even this boilerplate:
fun createFindAgentTool(): Tool<*, *> {
    return AIAgentService
        .fromAgent(findAgent as GraphAIAgent<String, String>)
        .createAgentTool<String, String>(
            agentName = "__find_in_codebase_agent__",
            agentDescription = """
                <when to call your agent>
            """.trimIndent(),
            inputDescription = """
                <how to call your agent>
            """.trimIndent()
        )
}
You could think of this as roughly equivalent to:
public class FindAgentTool() : Tool<FindAgentTool.Args, FindAgentTool.Result>() {
    override val name: String = "__find_in_codebase_agent__"
    override val description: String = """
        <when to call your agent>
    """

    @Serializable
    public data class Args(
        @property:LLMDescription(
            """
            <how to call your agent>
            """
        )
        val input: String
    )

    @Serializable
    public data class Result(
        val output: String
    )

    override suspend fun execute(args: Args): Result {
        val output = findAgent.run(args.input)
        return Result(output)
    }
}
In either case, the only things we need to do are:
Create our sub-agent.
Give it an agentName.
Specify when to call the agent through the agentDescription prompt.
Specify how to call the agent through the inputDescription prompt.
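Once the tool exists, handing it to the main agent is just another entry in its tool registry. A minimal sketch, assuming the registry is built with Koog's ToolRegistry builder and reusing the tool names from earlier in this series (the exact builder calls may differ by Koog version):

```kotlin
import ai.koog.agents.core.tools.ToolRegistry

// Sketch: the main agent keeps its full tool set and gains the agent-as-a-tool wrapper.
// Tool names are the ones used earlier in this series; adjust construction to your project.
val mainAgentTools = ToolRegistry {
    tool(ListDirectoryTool)
    tool(ReadFileTool)
    tool(EditFileTool)
    tool(ExecuteShellCommandTool)
    tool(createFindAgentTool()) // the find sub-agent, exposed as a regular tool
}
```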
The prompts are, perhaps, the trickiest part. There's plenty of room for fine-tuning. But there's some indication that newer LLMs need less precisely tuned prompts, so perfectly fine-tuned prompts may not be worth our time. We're still exploring this topic ourselves, and it will take more experimentation to really come to a strong conclusion.
One thing we did notice is that, if we're not careful with the prompts, the main agent sometimes confuses the find agent with a simple Ctrl+F / ⌘F function, sending only the tokens it wants to search for. That's clearly suboptimal. With so little context, the find agent can't reason about what it should actually be looking for. To address this, we include instructions requiring the main agent to specify why it's looking for something. That way, the find agent can fully leverage its intelligence to find the actual thing the main agent is looking for.
"""
This tool is powered by an intelligent micro agent that analyzes and understands code context to find specific elements in your codebase.
Unlike simple text search (Ctrl+F / ⌘F), it intelligently interprets your query to locate classes, functions, variables, or files that best match your intent.
It requires a detailed query describing what to search for, why you need this information, and an absolute path defining the search scope.
...
"""
| Detailed query (intelligent search) | Bare token (Ctrl+F / ⌘F style) |
| --- | --- |
| Search for changes in get_search_results regarding unnecessary joins to see if there are comments or logic on unnecessary joins. | get_search_results |
| Search for environment variable usage with SKLEARN_ALLOW or similar in repository to find potential bypass of check_build. | SKLEARN_ALLOW |
We also noticed that the main agent sometimes still chooses to call the shell tool with a grep command instead of the find agent, which undermines the entire purpose of having a dedicated sub-agent. To avoid this pattern, we added this section to the main system prompt in order to push it harder:
"""
...
You also have an intelligent find micro agent at your disposal,
which can help you find code components and other constructs more
cheaply than you can yourself. Lean on it for any and all
search operations. Do not use shell execution for find tasks.
...
"""
This is the “natural compression” we mentioned in the opening. The find agent opens many files, follows dead ends, and explores the codebase. But the main agent only sees the results: relevant file paths, snippets, and explanations. All that exploration stays in the find agent’s context and disappears after it returns. Only the stuff that really mattered is then added to the main agent’s context.
The trade-offs
Using a sub-agent has its benefits, but it also has downsides. This is certainly the kind of change that warrants experimentation to show whether it delivers the benefits we’re hoping for without too many sacrifices.
The first trade-off is cost and time. While shortening the context in the main thread helps bring down the cost and time there, we now also have to pay and wait for a number of LLM calls in the sub-agent. The hope is that the total cost and time spent are less, but that depends on how the main agent uses the sub-agent. If it ends up doing a large number of small queries, that benefit might not materialize. We will look at the costs when we run the benchmarks again in a later section, and we will just assume that cost and time are correlated.
We did notice this happening in some of our runs, so we added a segment to the tool's agentDescription that explains the issue to the main agent and tries to limit the frequency of such high volumes of small queries:
"""
...
While this agent is much more cost efficient at executing searches than using shell commands, it does lose context in between searches. So give preference to clustering similar searches in one call rather than doing multiple calls to this tool.
...
"""
A second trade-off is that this approach treats context retention in a far more black-and-white way than humans do. We may not pull everything that happened in the past into active memory, but we do keep vague impressions of what happened and can retrieve additional context when needed. There are ways to model this kind of behavior, but they are far beyond the scope of the current iteration of our agent and are more related to the deep and complex subject of agentic memory.
Another challenge is that this approach adds complexity to tracing. In Langfuse, we no longer have just one agent's trace to examine; we might even need to look at the behavior from multiple perspectives: the full view and each agent separately.
Think wider: The engineering team analogy
This technique of using sub-agents isn’t limited to simple cases like the find agent. You could, for example, replicate the separation of concerns in team structures by assigning analysis, implementation, testing, and planning to different sub-agents.
It’s still an open question whether an agent with all these capabilities does better or worse than a system where such capabilities are divided among sub-agents, but it’s not hard to imagine potential benefits. Think of Conway’s law: “Organizations design systems that mirror their communication structure.” One interpretation is that these communication structures evolved to discover efficient patterns worth keeping. The reverse Conway maneuver even suggests this is desirable.
Could the same be true for role distribution? Maybe the division of tasks across different specializations in software teams also evolved to discover efficient ways of working. Maybe LLMs could benefit from that, too.
Yet this is not guaranteed. The efficiencies might stem largely from spreading the human learning processes, and this might not apply to LLMs. But in the book Clean Code, we read about wearing different hats: a writer hat (creator), a reader hat (maintainer), and a tester hat (tester). The idea is to focus on one role without being distracted by the perspectives of the others. This suggests task division goes deeper than just learning efficiency, meaning it might indeed be relevant to LLMs.
All of this is to say that you can take sub-agents a lot further, but whether this is a beneficial approach is still unproven. It’s still an art form for now, not a hard science.
Benchmark results: Testing the hypothesis
We’re happy to report that our version without the find sub-agent shows a cost of about $814, or $1.63 per instance, while our version with this sub-agent shows a cost of about $733, or $1.47 per instance. That’s a 10% cost saving, which is definitely worth noting.
One interesting observation is how strongly the results depend on the choice of LLM for the sub-agent. In a smaller experiment, we tried keeping our sub-agent connected to GPT-5 Codex, and that dramatically increased the cost to $3.30 per example, averaged over 50 examples.
| Experiment | Success rate | Cost per instance |
| --- | --- | --- |
| Part 03 (Langfuse) | 56% (278/500) | $1.63 ($814/500) |
| Part 04 (sub-agent GPT-4.1 mini) | 58% (290/500) | $1.47 ($733/500) |
| Part 04 (sub-agent GPT-5 Codex) | 58% (29/50) | $3.30 ($165/50) |
However, it is interesting to note that we hypothesized two ways to reduce cost. The first was shrinking the context size through the natural compression achieved by task handoffs, and the second was offloading work to a cheaper model. The data suggests that just splitting off a sub-agent (and keeping the GPT-5 Codex model) actually increases the cost significantly, so our first method doesn't seem to work, while our second (cheaper models) is the one that seems to do the trick, though this may not be rigorous proof.
As for performance improvements, we see a small uptick from 56% to 58%. This could be within the tolerance of statistical variance, but it’s encouraging that performance at least stayed consistent while we reduced costs.
Conclusion
We’ve seen that creating sub-agents is both simple and potentially powerful. Koog provides convenient tooling to streamline the process even further, leaving only the prompts for the agent-as-a-tool for you to define.
This technique clearly delivers worthwhile cost savings. We achieved nearly a 10% reduction: a clear, measurable improvement. The performance impact is less clear, but it does look like there might be some gains there, too.
At the same time, these kinds of evaluations are expensive. Even with reduced costs, this benchmark still totaled about $730. That's why, in the next part, we will take a closer look at another strategy for lowering costs: a more general approach to compression. In it, we'll answer the question, "How do you prevent your context from growing indefinitely, and your costs growing with it?"