Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Getting Contact Information with .NET MAUI


Let users easily pull contact info (like name, number, email) from one app and save it to their device’s contact list in .NET MAUI.

Let’s be honest, the only “accessory” you take with you everywhere—no matter the activity you’re doing or the type of event it is, and something that never bothers you to carry—is your mobile device.

OK … and what does this mean? It means that a large part of the things you need to do in your day-to-day life are covered by your phone: from super basic operations like calculating amounts and saving contacts, to talking with your friends or even ordering food.

Now, think about this: imagine you have an application with a form where you need to enter contact information—the same information you already saved on your phone when creating that contact. Having to type everything again can feel tedious, right?

This is where things get interesting. A simple action like saving a contact opens the door for multiple applications to obtain that information and work together with the data that already exists on the device. If your app could “pull” that information directly, without the user having to type it again, that would be a win!

So get excited, because in this article I’ll show you how to obtain that contact information to use it in your .NET MAUI apps.

What Do I Need Before I Start?

Before we begin, it’s important to configure a few platform-specific steps. Let’s go through each one, step by step:

Android

To be able to read contacts, you must explicitly request the READ_CONTACTS permission. There are three different ways to add this permission on Android:

Option 1: Add the permission directly in the AndroidManifest.xml

Go to Platforms → Android, open the AndroidManifest.xml file, and add the following node:

<uses-permission android:name="android.permission.READ_CONTACTS" />

Option 2: Use the Android Manifest graphical editor

Go to Platforms → Android, double-click the AndroidManifest.xml file, and locate the Required permissions section. Find the permission labeled READ_CONTACTS and simply check the option, as shown below.

Option 3: Add the assembly-based permission

Go to Platforms → Android → MainApplication.cs and add the permission as follows:

[assembly: UsesPermission(Android.Manifest.Permission.ReadContacts)]

iOS/Mac Catalyst

On iOS and Mac Catalyst, we also need a very similar configuration to be able to access contacts. Follow these steps:

Step 1:
Right-click on Platforms → iOS → Info.plist and on Platforms → MacCatalyst → Info.plist (yes, each configuration is done per platform) and open the file with the editor.

Step 2:
Once the file is open, add the NSContactsUsageDescription key along with its description. This step is mandatory to comply with Apple’s guidelines. Keep in mind that this text must be very precise, as it will be the message shown to the user when the app requests permission to access their contacts.

<key>NSContactsUsageDescription</key>
<string>This app needs access to your contacts to select a contact and retrieve its information.</string>

Windows

Contact picking is not supported on Windows.

Let’s Start!

All set! We’ve prepared the initial configuration to get started!

.NET MAUI provides the IContacts interface, which allows us to select and retrieve information from the contacts stored on the device.

This interface is part of the Microsoft.Maui.ApplicationModel.Communication namespace and is available through the Default property, which gives us access to the ready-to-use implementation provided by .NET MAUI.

Important Considerations for iOS and macOS

There is a naming conflict on these platforms: besides the .NET MAUI Contacts class, the name Contacts also refers to the namespace of Apple’s Contacts framework. As a result, if you write only Contacts, the compiler can’t tell which one you mean, and you might think there is an issue with your code when in reality the name is just resolving to the wrong thing.

What’s the Solution?

To avoid this conflict, you must tell .NET exactly where the class you want to use comes from. This is done by using the fully qualified name:

Microsoft.Maui.ApplicationModel.Communication.Contacts

A Cleaner Solution

There is also a cleaner way to handle this. Instead of writing the full name every time you use it, you can create an alias with a using directive, like this:

using Communication = Microsoft.Maui.ApplicationModel.Communication;

This gives the namespace a shorter name. Then, you can use that alias directly in your code:

var contact = await Communication.Contacts.Default.PickContactAsync();

What Contact Information Can We Retrieve Exactly?

Let’s take a look at the data that can be retrieved from a contact. The following are the available fields:

  • Id: The internal identifier of the contact on the device.
  • NamePrefix: The name prefix (Mr., Mrs., Eng., Dr., etc.)
  • GivenName: The contact’s first name.
  • MiddleName: The contact’s middle name.
  • FamilyName: The contact’s last name.
  • NameSuffix: The name suffix (if any), for example, Jr.
  • Phones: The list of phone numbers associated with the contact.
  • Emails: The list of email addresses associated with the contact.
  • DisplayName: The full name displayed on the phone. For example: Eng. Maris Lopez.

Getting Your Device Contact List

To retrieve the full list of contacts stored on your device, you can use the GetAllAsync() method. This method returns a collection containing all available contacts.

Below is an example showing how to retrieve them:

public async IAsyncEnumerable<string> GetContactNames()
{
    var contacts = await Contacts.Default.GetAllAsync();

    // No contacts
    if (contacts == null)
        yield break;

    foreach (var contact in contacts)
        yield return contact.DisplayName;
}

But How Can I Select a Specific Contact?

To ask the user to select a specific contact from their device, we use the PickContactAsync() method. This method opens the system’s contact list, where the user can simply choose a contact.

⚠️ If the user does not select any contact or cancels the action, the method returns null.

In the following example, you can see how to retrieve the different pieces of information from the selected contact:

private async void SelectContactButton_Clicked(object sender, EventArgs e)
{
    try
    {
        var contact = await Contacts.Default.PickContactAsync();

        // The user canceled or did not pick a contact
        if (contact == null)
            return;

        string id = contact.Id;
        string namePrefix = contact.NamePrefix;
        string givenName = contact.GivenName;
        string middleName = contact.MiddleName;
        string familyName = contact.FamilyName;
        string nameSuffix = contact.NameSuffix;
        string displayName = contact.DisplayName;
        List<ContactPhone> phones = contact.Phones;
        List<ContactEmail> emails = contact.Emails;
    }
    catch (Exception ex)
    {
        // Most likely permission denied
    }
}

✍️ Platform Differences You Should Know

Android

  • GetAllAsync does not support the cancellationToken parameter.

iOS / Mac Catalyst

  • GetAllAsync does not support the cancellationToken parameter.
  • The DisplayName property is not supported natively. Instead, its value is constructed as “GivenName FamilyName”.

Conclusion

And that’s it! In this article, you learned how to work with device contacts in .NET MAUI, from selecting a specific contact to retrieving the full contact list. You also explored key platform differences and important considerations to keep in mind when targeting Windows, Android, iOS and Mac Catalyst.

With this knowledge, you can now integrate contact access into your apps in a clear and reliable way, improving the user experience while handling platform-specific behaviors correctly.

See you in the next article! ✨

Reference

Code samples and explanations were based on the official documentation.


.NET at Microsoft Build 2026: Must watch sessions


That’s a wrap on Microsoft Build 2026! From union types in C# to agentic web apps and AI on the edge with .NET MAUI, this year’s event showed how .NET 11 is built for the AI era. Whether you joined live or are catching up on demand, here are the .NET sessions worth your time.

Tip

Want to binge the whole lineup? Watch the .NET at Build 2026 playlist on the .NET YouTube channel. For the full list of announcements across all of Microsoft, check out the Build 2026 Book of News.

Featured .NET Sessions

Union types in C#

Union types are coming to C#! Unions model closed sets of data shapes, as commonly seen in wire protocols and domain modeling. Mads and Dustin explore the clean expression of intent and the confidence and elegance that unions lend to consuming code. This is one of the most requested language features, and it’s finally landing.

Watch the session →

.NET 11 in depth: Runtime, libraries, and SDK for the AI era

.NET 11 delivers a new wave of improvements across the runtime, libraries, and SDK to help developers build modern applications for the AI era. This session takes an in-depth look at the key investments in performance, diagnostics, and developer productivity, and how they come together to support intelligent, cloud-connected, and agent-driven apps.

Watch the session →

AI Building Blocks for .NET: Add intelligence to your C# apps

A practical, opinionated guide to building intelligent apps in .NET. This session walks you through the building blocks you need to add AI capabilities to your C# applications, from model integration to agent patterns, all with production-ready code you can use today.

Watch the session →

Building for the agentic web with .NET 11

The demands on modern web apps are increasing. Users expect more performance, airtight security, and even agentic capabilities. In .NET 11, ASP.NET Core and Blazor are getting faster and more secure at the core, closely integrated with Aspire for distributed app development, and gaining a new set of building blocks (agents, tools, skills, and components) for building agentic web apps. Get ready to build for the modern agentic web!

Watch the session →

Taking your AI to the edge with .NET MAUI

AI is transforming both what we build and how we build it. Learn how .NET MAUI developers can bring AI to the edge using local models and on-device capabilities across mobile and desktop. This session covers the impact on privacy, performance, and UX, explores .NET 10 features and what’s coming in .NET 11, and shows how AI-powered tools and agentic workflows can accelerate app development.

Watch the session →

Simplifying .NET Installs with dotnetup

A new way to manage .NET SDK and Runtime installations that works for every user, on every platform! The dotnetup tool simplifies getting started with .NET and keeping your installations up to date, making the developer onboarding experience smoother than ever.

Watch the session →

Related Sessions

Looking for more? These related sessions from Build 2026 cover topics that intersect with .NET development.

Visual Studio

Aspire

Windows development

Microsoft Foundry

  • From prototype to production build and run agents at scale: AI agents are transforming how developers build software, but shipping production-grade agents demands more. This session walks through the end-to-end lifecycle of building AI agents with Foundry Agent Service and Microsoft Agent Framework. See how to go from local prototyping to enterprise-grade hosted deployment with identity, secure networking, evaluations, and lifecycle management. Learn how coding agents like GitHub Copilot integrate directly into the workflow.
  • Claw and agent harness in Microsoft Foundry: Go deep on multi-agent systems built on Microsoft Foundry, featuring Claw agent patterns and the hosted agents architecture. See how coding agents integrate into multi-agent workflows using Microsoft Agent Framework.

GitHub Copilot SDK

App modernization

📚 Get Started

Ready to dive in? Here’s how to get started with everything announced at Build 2026.

Important

Many of the features shown in these sessions are available today in .NET 11 previews. Download the latest preview and start building with the new capabilities now!

The post .NET at Microsoft Build 2026: Must watch sessions appeared first on .NET Blog.


Enterprise Live Migrations: Moving from Azure DevOps Repo to GitHub with minimal disruption


Over the last several years, we’ve encouraged customers to move their repositories from Azure Repos to GitHub to take advantage of the latest AI-powered and agentic development experiences.

For many enterprise teams, however, migrating at scale comes with real constraints. Traditional approaches can require extended downtime – sometimes days – which isn’t acceptable for teams running critical workloads.

To address this, we’re introducing Enterprise Live Migrations (ELM), in limited public preview.

Migrations begin without locking the Azure DevOps repository, with changes continuously synchronized to GitHub while developers keep working. When ready, teams can schedule a cutover to complete the transition – with only a brief downtime window, typically under 30 minutes. This means no extended freeze periods, no multi-day outages – just a controlled, predictable transition that fits into your operations. Teams can migrate at their own pace, without coordinating complex, high-risk “all-at-once” migrations.

🪧 Sign up for the Preview

ELM currently supports migrations to GitHub Enterprise Cloud with data residency. A script-based migration experience is available today, with a UI-based experience coming soon to provide a more streamlined end-to-end workflow. We expect to remain in limited public preview over the next couple of months as we continue refining the experience, adding new features, and incorporating customer feedback. Your input is vital to making this experience successful.

If you are interested in participating in the preview, you can sign up today. We will follow up with all the information you need to get started.

🤖 How ELM works

ELM follows a simple, staged workflow:

  • Start and validate — ensure the repository is migration-ready
  • Continuous sync — keep GitHub up to date while development continues
  • Cutover — perform a final sync and transition GitHub to the system of record

During most of the process, Azure DevOps remains fully writable, so teams can keep working without interruption.

For detailed guidance, learn more here.


💬 Conclusion and key takeaways

Enterprise Live Migrations provides a practical path for organizations moving to GitHub Enterprise Cloud with data residency:

  • Minimize disruption with continuous synchronization and a short cutover window
  • Reduce risk by migrating without pausing active development
  • Adopt flexibly with support for hybrid Azure DevOps and GitHub workflows

Learn more and sign up for preview.

The post Enterprise Live Migrations: Moving from Azure DevOps Repo to GitHub with minimal disruption appeared first on Azure DevOps Blog.


WWDC 2026: All the news from Apple’s developers conference


Apple’s annual WWDC event is kicking off on June 8th with a keynote presentation starting at 1PM ET / 10AM PT, where Apple will announce major updates to iOS, macOS, and its other operating systems. 

Among those updates could be Apple’s delayed Siri overhaul, which has faced setbacks since it was initially announced at WWDC 2024. Apple is revamping Siri with some help from Google Gemini and is also rumored to have a dedicated Siri app in the works. 

This is also the last keynote we’re expecting to see from Tim Cook before he steps down as CEO on September 1st and is replaced by John Ternus, who currently leads hardware engineering.

Follow along here for all the latest news and updates. 


GitHub for Beginners: Answers to some common questions


Welcome back to GitHub for Beginners. This is the final episode of the season, and we’ve covered a lot so far. Make sure to check out our other episodes to see all the various topics we’ve discussed.

Today, we’re going to spend some time answering some questions that people often have, especially when they’re first getting started. So without further ado, let’s jump right in.

As always, if you prefer to watch the video or want to reference it, we have all of our GitHub for Beginners episodes available on YouTube.

What is SSH and how do I add my SSH key to GitHub?

SSH stands for Secure Shell, a protocol for connecting securely to another machine. An SSH key is a pair of files on your computer: a private key and a public key.

The private key stays on your computer and should never be shared. The public key is what you share with platforms like GitHub. When you store your public key on GitHub, git uses your private key to confirm your identity when you push and pull code. In order for you to be authenticated, your public key on GitHub needs to match the private key on your computer.

So how do you do this? Let’s create a key pair and add your public key to GitHub now.

(And remember, if you prefer a video walkthrough, that is available.)

  1. Open up a terminal and enter the following command. Remember to replace the email placeholder with the email address you use to log into GitHub.
ssh-keygen -t ed25519 -C "YOUREMAIL@DOMAIN.COM"
  2. When it prompts you to enter a file to save the key, press Enter to accept the default file and location.
  3. Enter a passphrase that you’ll remember. Note that the terminal will not display what you type, so be careful not to have any typos!
  4. Reenter your passphrase.

This will create your new SSH key. Now you want to add it to your ssh-agent. An ssh-agent is a program that securely stores your keys so that you don’t need to keep entering your passphrase.

🔍 To learn more, check out our docs about adding your SSH key to ssh-agent

To add this new SSH key to the ssh-agent, run the following command. Note that you will need to enter your passphrase when prompted.

ssh-add ~/.ssh/id_ed25519

Now that you have created the SSH key and configured your ssh-agent, the next step is adding the public key to GitHub.

  1. In your terminal, run the following command.
cat ~/.ssh/id_ed25519.pub
  2. Copy the entire line that appears in the terminal as a response to that command.
  3. Open a browser and navigate to github.com.
  4. Click your profile picture in the top-right corner and select Settings.
  5. In the menu on the left-hand side, select SSH and GPG keys.
  6. On the right-hand side, click the green New SSH key button.
  7. Give the key that you’re about to add a name in the “Title” box that describes this key in a way you’ll remember. For example, if this is your work laptop you might enter a title of “work-laptop”.
  8. Paste the key you copied from the terminal into the “Key” box.
  9. Click the green Add SSH key button at the bottom of the window.

Congratulations! Your computer is now configured to connect to GitHub over SSH.
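If you would like to see what a key pair looks like before touching your real ~/.ssh folder, here is a minimal sketch that generates a throwaway pair in a temporary directory. The file name id_ed25519_demo and the email address are made up for illustration, and the empty passphrase (-N "") is only there so the script runs without prompts; use a real passphrase for the key you actually upload.

```shell
set -eu

# Work in a scratch directory so the real ~/.ssh is never touched.
keydir=$(mktemp -d)
keyfile="$keydir/id_ed25519_demo"   # hypothetical file name for this demo

# -q: quiet, -N "": empty passphrase (demo only), -f: where to write the pair.
ssh-keygen -q -t ed25519 -C "you@example.com" -N "" -f "$keyfile"

# The private key is "$keyfile"; the public key is "$keyfile.pub".
# The first field of the public key line names the key type.
pub_type=$(cut -d ' ' -f 1 "$keyfile.pub")
echo "Public key type: $pub_type"

# Print the fingerprint, which GitHub also displays next to a saved key.
ssh-keygen -l -f "$keyfile.pub"
```

The line stored in the .pub file is the only part you would ever paste into GitHub.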

How do I add a PAT to GitHub? What is a PAT?

PAT stands for Personal Access Token. A PAT is a special credential that you create on GitHub for tools that need authentication. You control its permissions and can revoke it any time. On GitHub, you’ll commonly use a PAT to authenticate via command line or the GitHub API.

There are two types of PATs available: fine-grained tokens and classic tokens. First we’ll walk through creating a fine-grained PAT.

  1. Open a browser and navigate to github.com.
  2. Click your profile picture in the top-right corner and select Settings.
  3. Scroll to the bottom of the list of options in the left-hand column and select Developer settings.
  4. In the left-hand column, expand the option for Personal access tokens. Select Fine-grained tokens from the options displayed.
  5. Click the green Generate new token button in the center of the window.
  6. Enter a name and description for the token. This should make it clear what the token is going to be used for (e.g., a name of “cli-access” with a description of “access the Copilot CLI”).
  7. Under “Expiration,” select a date that matches how long you need the token to be valid. Once the token expires, it will not work anymore.
  8. Under “Repository access,” select which repositories you want the PAT to be able to access. You can limit the selection to only specific repositories if you know which repositories it will need to access.
  9. Under “Permissions”, click Add permissions to select which permissions you’re granting to this PAT. This lets you define the scope of what the PAT can do.
  10. For each permission, you can specify whether it has read-only access or read and write access.
  11. When you’re satisfied with the permissions, scroll to the bottom and click the green Generate token button.
  12. A window pops up, providing a review of all of the information associated with this token. Verify that the information is correct, and then click Generate token.
  13. GitHub will now show you the token. Make sure that you copy it and store it in a safe location (e.g., a password manager), because GitHub only shows you this token once.
🔍 For more information, check out our documentation about Personal Access Tokens.

Now let’s go through creating a classic token. As you’ll see, it’s very similar in several ways.

  1. Open a browser and navigate to github.com.
  2. Click your profile picture in the top-right corner and select Settings.
  3. Scroll to the bottom of the list of options in the left-hand column and select Developer settings.
  4. In the left-hand column, expand the option for Personal access tokens. Select Tokens (classic) from the options displayed.
  5. In the main window, click Generate new token and select Generate new token (classic).
  6. Give the token a clear name that explains what it will be used for (e.g., “terminal-access”).
  7. Under “Expiration,” select a date that matches how long you need the token to be valid. Once the token expires, it will not work anymore.
  8. Select the scopes for your token. The scopes indicate what access permissions the token grants.
  9. When you’re satisfied with the scopes, scroll to the bottom and click the green Generate token button.
  10. GitHub will now show you the token. Make sure that you copy it and store it in a safe location (e.g., a password manager), because GitHub only shows you this token once.

This creates the classic token. The next time that GitHub asks for your password in a terminal, instead of supplying your password, you could paste this token.

What’s the difference between merging and rebasing, and how do I fix a merge conflict?

A merge conflict is what happens when two changes touch the same part of a file, and git needs your help to decide what the final version should be. There are a few different ways you can resolve this, but we’re just going to walk through it using the GitHub UI.

  1. Open a pull request that has a merge conflict. GitHub will provide a message indicating that there’s a conflict and you won’t be able to automatically merge.
  2. Scroll to the bottom and click the Resolve conflicts button inside the warning about conflicts.
  3. GitHub opens the files that have conflicts. Use the editor to resolve the conflicts by choosing which version of the file to use in each case where there’s a conflict.
  4. Once the file has no more conflict markers (i.e., you’ve addressed every conflict), select Mark as resolved in the top-right.
  5. Repeat this process for every file that has merge conflicts.
  6. After you’ve addressed all the files that have conflicts, click the green Commit merge button at the top of the window.

This updates your pull request with the resolved conflicts, and now you’ll be able to merge that change into your repository. Well done!
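The same kind of conflict can also be reproduced and resolved on the command line. What follows is a minimal sketch in a throwaway repository, not the GitHub UI flow described above; the file name greeting.txt, the branch name feature, and the demo identity are all made up for illustration.

```shell
set -eu

# Demo identity so commits work even without a global git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "add greeting"

# Two branches change the same line in different ways.
git checkout -q -b feature
echo "hello from feature" > greeting.txt
git commit -q -am "feature greeting"

git checkout -q main
echo "hello from main" > greeting.txt
git commit -q -am "main greeting"

# The merge stops with a conflict; '|| true' keeps the script going.
git merge feature >/dev/null 2>&1 || true
conflicted=$(git diff --name-only --diff-filter=U)
echo "Conflicted files: $conflicted"
cat greeting.txt   # shows the <<<<<<<, =======, >>>>>>> conflict markers

# Resolve by writing the final content, then stage and conclude the merge.
echo "hello from main and feature" > greeting.txt
git add greeting.txt
git commit -q --no-edit
parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
```

Editing the file until no conflict markers remain and then committing is the local equivalent of Mark as resolved followed by Commit merge.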

Now let’s talk about the difference between merging and rebasing, and when you might want to use one over another.

Merging combines changes from one branch into another by creating a new commit that ties both histories together. It preserves the history of both branches. You should merge when you want to preserve the full history of how work happened. This is commonly used for feature branches that are going to be merged into main, such as when you’re adding new functionality.

On the other hand, rebasing moves or replays your branch’s commits on top of another branch. It rewrites the history to create a cleaner, linear commit timeline. You should rebase when you want a clean linear history, for example when you are working on a feature branch and want to pull in the latest changes from main before merging.
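To make the difference concrete, here is a small sketch that builds a scratch repository and integrates the same feature branch both ways. The branch names merged and rebased, the file names, and the demo identity are assumptions made up for this example.

```shell
set -eu

# Demo identity so commits work even without a global git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# feature and main each gain one commit, so their histories diverge.
git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"

git checkout -q main
echo main > main.txt
git add main.txt
git commit -q -m "main work"

# Option A: merge. A new commit with two parents ties both histories together.
git checkout -q -b merged main
git merge -q --no-edit feature
merge_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

# Option B: rebase. The feature commit is replayed on top of main.
git checkout -q -b rebased feature
git rebase -q main
rebase_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

git log --oneline --graph merged
git log --oneline --graph rebased
```

In the two graphs, the merged branch shows a fork that joins back together, while the rebased branch is a single straight line whose feature commit has a new hash.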

How do I undo my last commit?

Let’s say that you’re in a situation where you’ve already pushed your commit to your branch, and you want to undo it. You can undo your commit through the GitHub UI.

  1. Open the commit that you want to undo on github.com.
  2. Scroll to the bottom of the commit and select the Revert button.
  3. GitHub creates a new commit that undoes the changes from your previous commit. It’s important to realize that this doesn’t erase the commit history, but rather puts a new change in place that undoes your previous changes. You can now either merge this commit directly or open a pull request with it. Opening a pull request is the safest option when others might be using the branch.

If your changes are local, and you haven’t yet pushed to a branch, you can undo your last commit locally by running the following command.

git reset --soft HEAD~1

This removes the commit from your local repository, but keeps your work staged so that you don’t lose any changes. If you would rather undo your changes even locally to reset your workspace, you can use the following command. Just realize that by doing this, you might lose your work!

git reset --hard HEAD~1 
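If you want to try both commands safely first, the following sketch runs them in a scratch repository. The file name notes.txt, the commit messages, and the demo identity are made up for illustration.

```shell
set -eu

# Demo identity so commits work even without a global git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

echo one > notes.txt
git add notes.txt
git commit -q -m "first"

echo two >> notes.txt
git commit -q -am "second"

# --soft: the commit disappears from history, but its changes stay staged.
git reset -q --soft HEAD~1
commits_after_soft=$(git rev-list --count HEAD)
staged_after_soft=$(git diff --cached --name-only)
echo "Commits: $commits_after_soft, still staged: $staged_after_soft"

# Commit again, then undo with --hard: the commit AND its changes are discarded.
git commit -q -m "second (redone)"
git reset -q --hard HEAD~1
content_after_hard=$(cat notes.txt)
echo "notes.txt after hard reset: $content_after_hard"
```

After the soft reset the second line of notes.txt is still waiting in the staging area; after the hard reset it is gone from the working copy.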

How do I update or sync a forked repository on GitHub?

Forking a repository creates your own copy of a project so that you can explore or make changes to it without affecting the original repository. This is especially important when you want to contribute to an open source project. Here’s how you can fork a repository.

  1. Open the repository that you want to fork on github.com.
  2. Select the Fork button at the top of the repository.
  3. Choose the Owner of this forked version, which in most cases will be your GitHub account.
  4. You may optionally rename the repository by providing a new Repository name. By default, forked repositories keep the names of the upstream repository.
  5. At the bottom of the window, select Create fork.
  6. This creates a full copy of the project under your account. To work on it locally, select the Code button and clone it to your machine.
🔍 You can learn more about forking by checking our documentation.

Now that you’ve created the fork, you want to make sure that you still pull in the latest changes from the upstream repository. Otherwise, your forked copy can quickly become out of date.

  1. Navigate to the main page for your forked repository on github.com.
  2. At the top of the repository, select the Sync fork button.
  3. Select Update branch in the pop-up menu.

When you do this, GitHub automatically pulls in the latest changes from the upstream repository to keep your fork up to date. You can also do this from the command line.

  1. Open a terminal and navigate to your repository.
  2. Set the upstream repository. Make sure to update the URL in the following command with your original repository URL.
git remote add upstream YOUR_ORIGINAL_REPOSITORY_URL
  3. Pull in the latest changes.
git fetch upstream
  4. Merge the latest changes into your project. Note that the following command assumes that the upstream project uses main as the default branch. If it uses something else, you will need to use that branch in the following command.
git merge upstream/main
  5. This updates your local copy with all of the changes to the upstream branch. So now, you need to push them to GitHub to make sure your repository is synchronized.
git push origin main

Now you know how to work in your own copy of a project and keep your work synchronized!

How do I review a pull request on GitHub?

A pull request (often abbreviated PR) is a place to share code and talk about changes. Here are three helpful practices to keep in mind when you’re reviewing a pull request.

  1. Start by understanding the goal of the pull request. Open the pull request and read the description. See if it has an associated issue, any screenshots, or notes from the author. Knowing the purpose helps you know why the changes exist and what you’re looking for.
  2. Review the code changes in small sections. Open the Files changed tab and move through the changes one group at a time. If something isn’t clear, leave a comment on that line. Ask questions, offer suggestions, or let the author know if you see a better approach. Keep your comments specific so they know exactly what you’re referencing. It might help to open the code on your machine either via the command line or in a codespace to run the code yourself to ensure you understand. Use terms like “nit” if your comment is not a necessary suggestion for merging the pull request.
  3. Highlight what’s going well. When you see code that’s well organized, thoughtful, or teaches you something, mention it! Positive feedback reinforces good patterns and helps teammates feel supported.
🔍 Learn more about reviewing pull requests by taking a look at our documentation on the topic.

When everything looks ready, use the Submit review button to approve the changes or request updates.

Copilot code review can also help you understand pull requests and suggest improvements. Note that in order to use Copilot, your organization admin needs to enable Copilot for either your repository or your user account. Once Copilot is enabled, you don’t need to install anything special—Copilot code review will automatically appear as an option in pull requests.

  1. Open a pull request on github.com where you want to use Copilot code review.
  2. Select Reviewers in the top-right.
  3. Select Copilot from the list of suggested reviewers.

In a short amount of time, Copilot will complete its review. You can scroll down and see the comments left by Copilot. It always leaves a “Comment” type of review, not an “Approve” or “Request changes” type of review. This means that Copilot reviews do not count toward required approvals nor will they block merging changes.

🔍 Learn more by checking out our Copilot code review documentation.

Next steps

And that’s a wrap! With this episode, we’ve finished another season of GitHub for Beginners, ending with some of the most common questions we’ve seen or heard. We hope that you found this information helpful, and don’t forget to check out our full library of GitHub for Beginners topics.

Happy coding!

The post GitHub for Beginners: Answers to some common questions appeared first on The GitHub Blog.


Long-Running Agents


The following article originally appeared on Addy Osmani’s blog and is being reposted here with the author’s permission.

A long-running AI agent can keep making progress over hours, days, or weeks. It can do this across many context windows and sandboxes, recover from failure, leave structured artifacts behind, and resume where it left off.

For two years the dominant image of an “AI agent” has been a chat window with a clever loop in it. You type a goal; the agent calls some tools; you watch tokens stream by; you stop watching when the work runs out of patience or the context window fills up. That paradigm got us a long way, but it has a ceiling. The model forgets. It declares “task complete” when it isn’t. It reintroduces a bug it fixed nine turns ago. The whole thing is structured around a single sitting.

Long-running AI agents

Long-running agents are what comes next. The idea is easy to state: an agent that keeps making forward progress on a goal across many sessions and many sandboxes, possibly many days or weeks, while leaving the workspace clean enough that the next session can pick up where the last one left off. The engineering is harder. You have to solve for persistence, recovery, and verification in a way that doesn’t just paper over the cracks. You have to build a state layer that lives outside the model’s context window, and you have to design the handoff between sessions so the agent doesn’t lose its mind when it wakes up and finds itself in a different sandbox with a different context window.

This post is my attempt to lay out what’s changed, who’s pushing on it, and how an engineer can use long-running agents today without writing the whole thing from scratch.

What “long-running” actually means

“Long-running” is used to mean at least three different things in practice, and it helps to keep them separate.

Long-horizon reasoning. The agent has to plan and execute over many dependent steps. This is mostly a model-quality story: coherence, planning, the ability to recover from a wrong turn 10 steps ago. METR has been tracking this with their time horizon metric, which estimates how long a task a frontier model can complete with 50% reliability. The headline finding is that the metric has been doubling roughly every seven months since 2019, and their TH1.1 update earlier this year doubled the count of eight-hour-plus tasks in the eval set. If that curve holds, frontier agents complete tasks at the day scale by 2028 and the year scale by 2034.

Long-running execution. The agent’s process runs for hours or days. Maybe it’s a coding job, maybe it’s a research sweep, maybe it’s a 24-7 monitoring service. The model might be invoked thousands of times across the run. This is mostly a harness story, and it’s the one this post is mostly about.

Persistent agency. The agent has an identity that outlives any single task. It accumulates memory, learns user preferences, and is always available. This is the Memory Bank flavor of long-running.

In practice the three blur together. A real production agent does long-horizon reasoning inside a long-running execution backed by persistent agency. But the engineering problems are different in each, and so are the products that solve them.

Why this matters

There are two reasons I believe this work matters a lot right now.

The first is a phase change in what’s economically feasible to delegate. An agent that runs for 10 minutes can answer a question, summarize a doc, fix a small bug. An agent that runs for 10 hours can own an entire feature, finish a migration that was on the backlog for six quarters, or do the kind of overnight research sweep that used to require a junior analyst. One of Anthropic’s Claude Sonnet announcements put concrete numbers on this last fall: 30+ hours of autonomous coding in internal tests, including one run that produced an 11,000-line Slack-style app. That’s already past the threshold where the answer to “Should I delegate this?” is no longer obvious.

The second is that persistence changes what the agent is. A stateless agent answers your question and disappears. A long-running one accumulates context: which competitor moved which way last week, which test flaked twice on Tuesday, what you usually mean by “the dashboard.” Anthropic’s Project Vend was the most public early demonstration of this. They had a Claude instance run an actual office vending business for a month, managing inventory, setting prices, talking to suppliers. It failed in informative ways, and the second phase ran much better, but the point wasn’t profitability. The point was watching what kinds of weird coherence problems show up when an agent has to maintain identity across weeks instead of turns.

Those are the same problems every team building production agents now hits.

The three walls every long-running agent hits

Three walls show up in basically every write-up I’ve read this year.

Finite context. Even a 1M-token window fills. And context rot, the steady degradation of model performance as the window gets full, kicks in well before the hard limit. A 24-hour run is not going to fit in any context window the field has on its roadmap. Something has to give.

No persistent state. A new session starts blank. Anthropic’s framing in their scientific computing post is the cleanest version I’ve seen: “Imagine a software project staffed by engineers working in shifts, where each new engineer arrives with no memory of what happened on the previous shift.” Without an explicit persistence story, every shift change is a productivity disaster.

No self-verification. Models reliably skew positive when they grade their own work. Asked “Are you done?” they answer “yes” more often than they should. Without a separate signal that the work meets a bar, you get the agent that ships at 30% complete with full confidence.

Long-running agent designs are mostly answers to these three problems. The major labs have converged on similar shapes of answer, but with very different surface area.

The Ralph loop: One of the simpler practitioner versions of long-running agents

The Ralph loop (sometimes called the Ralph Wiggum technique) is one of the simpler practitioner versions of long-running agents, popularized by Geoffrey Huntley and Ryan Carson. The reference implementation is literally a bash script that loops:

  1. Pick the next unfinished task from a list (prd.json or equivalent).
  2. Build a prompt with the task, the relevant context, and any persistent notes.
  3. Call the agent.
  4. Run tests or other checks.
  5. Append what happened to progress.txt.
  6. Update the task list (done, failed, blocked).
  7. Go back to step 1.

The reason it works is the same reason any of the harnesses below work: State lives outside the agent’s context. prd.json is the plan, progress.txt is the lab notes, and AGENTS.md is the rolling rulebook. The agent itself is amnesiac, but the filesystem isn’t. Each iteration starts fresh and reads enough state from disk to keep going. Carson’s Compound Product extends the idea by chaining multiple loops (an analysis loop that reads daily reports, a planning loop that emits a PRD, an execution loop that writes the code), which is roughly the open source version of the planner-generator-evaluator triad Anthropic landed on independently.
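
The seven steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Huntley’s or Carson’s script: the prd.json schema (a list of task objects with id, description, and status) is invented for the example, and call_agent and run_checks are hypothetical callables standing in for whatever agent CLI and test command you use. It marks tasks only as done or failed and leaves out the blocked state.

```python
import json
from pathlib import Path
from typing import Callable


def ralph_loop(
    workdir: Path,
    call_agent: Callable[[str], str],  # hypothetical: sends a prompt to your agent, returns its reply
    run_checks: Callable[[], bool],    # hypothetical: runs tests or lint, True when they pass
    max_iterations: int = 50,
) -> int:
    """Loop until no task is 'todo'. Returns the number of agent iterations used."""
    prd_path = workdir / "prd.json"           # the plan (assumed schema: list of task dicts)
    progress_path = workdir / "progress.txt"  # append-only lab notes
    rules_path = workdir / "AGENTS.md"        # rolling rulebook, optional

    for iteration in range(max_iterations):
        # Every iteration re-reads state from disk; nothing is carried over in memory.
        tasks = json.loads(prd_path.read_text())
        todo = [t for t in tasks if t["status"] == "todo"]
        if not todo:
            return iteration
        task = todo[0]

        rules = rules_path.read_text() if rules_path.exists() else ""
        progress = progress_path.read_text() if progress_path.exists() else ""
        prompt = (
            f"Rules:\n{rules}\n\nProgress so far:\n{progress}\n\n"
            f"Task {task['id']}: {task['description']}"
        )

        reply = call_agent(prompt)
        passed = run_checks()

        # Record what happened, then write the updated plan back to disk.
        task["status"] = "done" if passed else "failed"
        with progress_path.open("a") as f:
            f.write(f"[{task['id']}] {task['status']}: {reply[:200]}\n")
        prd_path.write_text(json.dumps(tasks, indent=2))

    return max_iterations
```

The design point is that the function keeps no state between iterations: killing the process and starting it again picks up from whatever prd.json says.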

I went deeper on all of this in “Self-Improving Coding Agents”: task list structure, progress files, QA gates, monitoring, the failure modes you’ll actually hit. The short version is that you can build a working long-running agent in an evening with a bash script and a JSON file. Most of what Google and Anthropic have productized is the work of making this pattern recoverable, secure, and observable at scale.

The big-lab stories below are different ways of paying for that production-readiness.

Anthropic: Harnesses, then the brain/hands/session split

Anthropic has been the most public about the engineering. Two posts are worth reading end to end.

The first is “Effective Harnesses for Long-Running Agents,” which lays out a two-agent harness for autonomous full stack development. An initializer agent runs once at the start of a project to set up the environment, expand the prompt into a structured feature-list.json, and write an init.sh that future sessions will run on boot. A coding agent is then woken up over and over, each session asked to make incremental progress on one feature, run tests, leave a claude-progress.txt note, and commit. A test ratchet (“it is unacceptable to remove or edit tests because this could lead to missing or buggy functionality”) sits in the prompt to stop the very common failure of an agent deleting failing tests to “make them pass.” InfoQ’s writeup extends this into a planner, generator, and evaluator triad, on the same logic that separating generation from evaluation matters because models grade their own work too generously.
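
The ratchet described in the post lives in the prompt. A harness could also back it with a mechanical check; the sketch below is an assumption about one way to do that, not something the post describes. It fingerprints test files before a session and reports any that were removed or edited afterward. The test_*.py naming pattern and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot_tests(test_dir: Path) -> dict[str, str]:
    """Map each test file's relative path to a SHA-256 digest of its contents."""
    return {
        str(p.relative_to(test_dir)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(test_dir.rglob("test_*.py"))
    }


def ratchet_violations(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return tests removed or edited between two snapshots. New tests are allowed."""
    problems = []
    for name, digest in before.items():
        if name not in after:
            problems.append(f"removed: {name}")
        elif after[name] != digest:
            problems.append(f"edited: {name}")
    return problems
```

A harness would call snapshot_tests before waking the coding agent, call it again after, and refuse the commit if ratchet_violations returns anything.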

The second is “Scaling Managed Agents: Decoupling the Brain from the Hands,” the architectural post behind Claude Managed Agents (Anthropic’s hosted runtime, launched in early April). The argument is that an agent has three components that should be independently replaceable. The Brain is the model and the harness loop that calls it. The Hands are sandboxed, ephemeral execution environments where tools actually run. The Session is an append-only event log of every thought, tool call, and observation.

This sounds abstract, but it isn’t. Here’s Anthropic’s framing: “Every component in a harness encodes an assumption about what the model can’t do on its own.” When you couple them, an assumption that goes stale (e.g., the model used to need an explicit planner and now plans natively) means the whole system has to change at once. When you decouple them, the harness becomes stateless, sandboxes become cattle, not pets, and a brain crash doesn’t lose the run. A fresh container calls wake(sessionId) and reconstitutes the state from the log. They reported time-to-first-token dropped ~60% at p50 and over 90% at p95 just from being able to start inference before the sandbox is ready.

The session-as-event-log idea is the part most teams underappreciate. It is what makes a long-running agent recoverable. Without it, a container failure is a session failure and you’re debugging into a stale snapshot. With it, the agent’s memory is a queryable artifact that lives outside whatever process happens to be running at the moment.
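
To make the event-log idea concrete, here is a minimal sketch that stores one JSON object per line and rebuilds state by replaying the file. It is not Anthropic’s implementation: the event kinds, the state shape, and this wake function (named after the wake(sessionId) call mentioned above) are all assumptions made for illustration.

```python
import json
from pathlib import Path


class SessionLog:
    """Append-only event log, one JSON object per line (JSON Lines)."""

    def __init__(self, path: Path):
        self.path = path

    def append(self, kind: str, payload: dict) -> None:
        # Append mode means existing events are never rewritten.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps({"kind": kind, "payload": payload}) + "\n")

    def events(self) -> list[dict]:
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


def wake(log: SessionLog) -> dict:
    """Rebuild working state by replaying the log from the start (hypothetical reducer)."""
    state = {"completed_steps": [], "last_observation": None}
    for event in log.events():
        if event["kind"] == "tool_result":
            state["last_observation"] = event["payload"]["output"]
        elif event["kind"] == "step_done":
            state["completed_steps"].append(event["payload"]["step"])
    return state
```

Because state is derived from the log rather than held in the process, a fresh process pointed at the same file reaches the same state as the one that crashed.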

For the scientific computing crowd, Anthropic’s “long-running Claude” post reduces all of this to a simpler stack: CLAUDE.md as a living plan the agent edits as it learns, CHANGELOG.md as portable lab notes, tmux plus SLURM plus git as the execution and coordination layer, and the Ralph loop, a for loop that kicks the agent back into context whenever it claims completion and asks if it’s really done. Their flagship case study is a Boltzmann solver Claude Opus 4.6 built over a few days that reached subpercent agreement with a reference CLASS implementation. Months to years of researcher time, compressed.

Same patterns across all three posts: an explicit plan file, an explicit progress file, structured handoffs between sessions, separate generation from evaluation, and a loop that refuses to let the agent stop early.

Cursor: Planners, workers, judges

Cursor’s “Scaling Long-Running Autonomous Coding” is the other essential read this year. They walked into walls that Anthropic mostly papered over.

Their first attempt was a flat coordination model: equal-status agents writing to shared files with locks. It became a bottleneck and made the agents risk averse, churning rather than committing. Their second attempt swapped locks for optimistic concurrency control, which removed the bottleneck but didn’t fix the coordination problem. The third design is what’s running in production now and what they describe as solving most of the problem:

  • Planners continuously explore the codebase and emit tasks. They can recursively spawn subplanners.
  • Workers are focused executors. They don’t coordinate with each other and they don’t worry about the big picture.
  • Judges decide when an iteration is finished and when to restart.
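
The three roles can be sketched as plain callables. This is a simplified illustration of the division of labor, not Cursor’s system: the plan, work, and judge signatures are assumptions, recursive subplanners are left out, and a thread pool stands in for isolated worker agents.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_iteration(
    plan: Callable[[], list[str]],            # planner: emits task descriptions
    work: Callable[[str], str],               # worker: executes one task in isolation
    judge: Callable[[dict[str, str]], bool],  # judge: decides whether the iteration is finished
    max_rounds: int = 3,
) -> tuple[bool, dict[str, str]]:
    """Plan, fan out to workers, and let the judge decide whether to stop or restart."""
    results: dict[str, str] = {}
    for _ in range(max_rounds):
        tasks = plan()
        # Workers never see each other's output; each gets exactly one task.
        with ThreadPoolExecutor(max_workers=4) as pool:
            outputs = list(pool.map(work, tasks))
        results = dict(zip(tasks, outputs))
        if judge(results):
            return True, results
    return False, results
```

Note that the worker never decides it is finished. That decision belongs to the judge, which is the point of splitting the roles.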

Two things stand out from the post. One: “A surprising amount of the system’s behavior comes down to how we prompt the agents,” more so than the harness or the model. Two: Different models slot into different roles. Their reported finding is that a GPT model was better than Opus for extended autonomous work specifically because Opus tended to stop early and take shortcuts. Same task, different role, different model. The matching is becoming part of the design surface.

This pairs with Composer 2 (their proprietary frontier coding model that ships in Cursor 3) and their background cloud agents: long-running tasks that run on Anysphere’s cloud infrastructure rather than your laptop. Eight-hour refactors and codebase-wide migrations survive a closed lid. You can start a task locally, hit run in cloud when you realize it’ll take 30 minutes, and reattach later from your phone. Each agent runs in an isolated Git worktree and merges back via PR. The handoff between local and remote is the part most teams haven’t figured out yet, and Cursor’s bet is that it has to be its own product surface.

The shape ends up close to Anthropic’s: Roles are split, sessions are durable, judges sit beside the worker, and a long task runs in a cloud sandbox with Git as the coordination substrate.

Google: Long-running agents on the Agent Platform

Google’s announcement at Cloud Next ’26 folded Vertex AI into the Gemini Enterprise Agent Platform and turned long-running agents into a named product, with named SLAs.

The pieces that matter for this post:

  • Agent Runtime supports agents that “run autonomously for days at a time” with sub-second cold starts and on-demand sandbox provisioning. The launch post’s example use case is a sales prospecting sequence that takes a week to play out, which is roughly the right shape for it.
  • Agent Sessions persist conversation and event history. You can pin them to a custom session ID that maps to your own CRM or DB record, so the agent’s state lives next to the business state instead of in a separate AI silo.
  • Agent Memory Bank is the persistent long-term memory layer, generally available as of Next ’26. It curates memories from sessions, scopes them to a user identity, and exposes a search API so the next agent invocation can pull what’s relevant. Payhawk reported that auto-submitting expenses through a Memory Bank-backed agent cut submission time by over 50%.
  • Agent Sandbox handles hardened code execution.
  • Agent-to-Agent Orchestration, Agent Registry, Agent Identity, Agent Gateway, Agent Observability, and Agent Simulation cover basically every operational concern you’d otherwise build by hand for a production fleet, including the cryptographic-identity-and-audit-log story enterprises actually need to ship.

Architecturally this is the same brain/hands/session split Anthropic described, just productized at platform scale and bundled with ADK (the code-first dev kit) and Agent Studio (the visual one). If you’re building inside Google Cloud, you don’t have to design a session log or a memory store from scratch anymore. You wire an ADK agent into Memory Bank and Sessions, deploy onto Agent Runtime, and the persistence question is answered.

Notice how much this looks like the pattern Anthropic and Cursor describe, just unbundled into named services with SLAs. Three years ago you’d have built all of this yourself. Now you pick which version of “decoupled brain, hands, and session” you want to rent.

Five patterns for long-running agents in production

Shubham Saboo and I wrote up five design patterns we’ve seen separate working long-running agents from demos. They aren’t Google-specific, but they map cleanly onto the primitives Agent Runtime now exposes, so it’s worth walking through them here in shortened form.

Checkpoint-and-resume. The most common multiday failure is context loss. An agent processes 200 documents over four hours, hits an error on document 201, and without a checkpoint you start from scratch. Treat the agent like a long-running server process: write intermediate state to disk, checkpoint every N units of work, recover from failures. The Agent Runtime sandbox gives you a persistent filesystem, but choosing the right checkpoint granularity (not every step, not only the end) is on you.
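
A minimal sketch of the pattern, assuming a local JSON checkpoint file and a hypothetical handle callable that does the per-document work. Nothing here is specific to Agent Runtime, and the names are illustrative.

```python
import json
import os
from pathlib import Path
from typing import Callable


def process_with_checkpoints(
    items: list[str],
    handle: Callable[[str], None],  # hypothetical per-document work
    checkpoint_path: Path,
    every: int = 10,
) -> int:
    """Process items in order, resuming from the last checkpoint. Returns items handled this run."""
    start = 0
    if checkpoint_path.exists():
        start = json.loads(checkpoint_path.read_text())["next_index"]

    handled = 0
    for index in range(start, len(items)):
        handle(items[index])
        handled += 1
        done = index + 1
        if done % every == 0 or done == len(items):
            # Write to a temp file, then rename, so a crash never leaves a half-written checkpoint.
            tmp = checkpoint_path.with_suffix(".tmp")
            tmp.write_text(json.dumps({"next_index": done}))
            os.replace(tmp, checkpoint_path)
    return handled
```

With every=10, a failure on item 23 resumes at item 21, so the two items handled since the last checkpoint are repeated. That means handle should be safe to run twice on the same item, and it is the granularity trade-off in practice: a smaller value repeats less work but writes to disk more often.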

Delegated approval (human-in-the-loop). Most “human-in-the-loop” implementations are: serialize state to JSON, fire a webhook, hope someone responds. The state goes stale, the notification gets buried, the agent re-deserializes into a slightly different world. Long-running runtimes let the agent pause in place with full execution state intact: reasoning chain, working memory, tool history, pending action. Hours of human time pass, the agent consumes zero compute, and it resumes with subsecond latency. Mission Control is Google’s inbox for this. The pattern works regardless of vendor.
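
As an in-process analogy only, a Python generator shows what “pause in place” means: execution stops at the pending action with its local state intact and continues when a decision is sent back in. A real runtime persists that state durably across hours and machines, which this sketch does not attempt. The agent and its action strings are hypothetical.

```python
from typing import Generator


def outreach_agent(leads: list[str]) -> Generator[str, bool, list[str]]:
    """Yield each pending action and wait for an approval decision before continuing."""
    sent: list[str] = []
    for lead in leads:
        # Execution pauses here; `sent` and the loop position survive the wait.
        approved = yield f"send email to {lead}"
        if approved:
            sent.append(lead)
    return sent


agent = outreach_agent(["ana", "bo"])
pending = next(agent)       # run until the first approval point
pending = agent.send(True)  # approve the first action, pause at the second
```

The contrast with the webhook approach is that nothing is serialized and re-read here: the decision arrives at the exact point where the agent stopped.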

Memory-layered context. A seven-day agent needs more than session state. Memory Bank handles long-term curated memory, Memory Profiles add low-latency lookups, and the failure mode you’ll hit in production is memory drift: The agent learns a procedural shortcut from a few atypical interactions and starts applying it broadly. Govern memory like you govern microservices. Agent Identity controls who can read and write which banks. Agent Registry tracks which version of which agent is running. Agent Gateway enforces policy on the wire. The auditing question stops being “What are my agents doing?” and becomes “What are my agents remembering, and how is that changing their behavior?”

Ambient processing. Not every long-running agent talks to a human. Some sit on a Pub/Sub stream or a BigQuery table and act on events as they arrive: content moderation, anomaly detection, inbox triage. The architectural decision worth making early is to not hardcode policy into the agent. Define it in the Gateway and the fleet picks up policy changes without redeploys. Ambient agents run unsupervised for long stretches, and the only sane way to update a hundred of them is to update the policy layer once.

Fleet orchestration. In real systems, you rarely have one agent. A coordinator delegates subtasks to specialists (a Lead Researcher Agent, a Scoring Agent, an Outreach Agent), each running independently for different durations. Each specialist gets its own Identity (so the Outreach Agent can’t read financial data meant for Scoring), its own policy enforcement, its own Registry entry. This is the same coordinator/worker shape distributed systems have used for decades. What’s new is that ADK handles it declaratively with graph-based workflows, and a bad deployment in one specialist doesn’t cascade to the others.

The patterns compose. A compliance system might use checkpointing for document processing, delegated approval for review gates, memory layering for cross-session knowledge, and fleet orchestration to coordinate the specialists. The opening question is always the same: What’s the longest uninterrupted unit of work your agent needs to perform? Minutes, and you don’t need long-running agents. Hours or days, and these patterns are where to start. The full write-up with code samples covers each pattern in depth.

So how do you actually build one today?

This is the practical question, and it has a different answer depending on what you’re building.

You’re a developer who wants long-running coding work on your own repo. Just use Claude Code (or Antigravity, Cursor, or Codex). The harness is already there. Treat your AGENTS.md like a pilot’s checklist: short, every line earned by a real failure. Add hooks for typecheck and lint that surface failures back to the agent. Write a plan file before the agent starts. Use the Ralph loop when the agent claims it’s done and you don’t believe it. For multihour or overnight jobs, run in a worktree so a closed laptop doesn’t kill the run, and have it commit progress every meaningful unit of work. This is the path most people should take, and it’s where the most leverage is right now.

You’re building a hosted agent product. Don’t build the runtime. Pick a managed one. The three real options today: Google’s Agent Platform (Agent Engine + Memory Bank + Sessions), Claude Managed Agents, or roll something on top of ADK, the Claude Agent SDK, or Codex SDK and host it yourself. The trade-off is the usual one. Managed gets you the brain/hands/session split, observability, identity, and an audit trail out of the box. Self-hosted gets you control and the ability to use weird models for weird roles (Cursor’s pattern). For most teams, the right starting point is a managed runtime plus your own ADK or SDK code for the actual loop.

You’re doing something autonomous and operational (monitoring, research, ops). Memory Bank-style persistence is what you want, and it’s the part that doesn’t exist in Claude Code. ADK + Memory Bank + Cloud Run + Cloud Scheduler is the cleanest stack I’ve seen for “agent runs every N hours, accumulates state, alerts on a threshold.” This is also where Cursor’s planner/worker/judge split starts to matter more than it does for IDE coding, because the work is genuinely parallel and the failure modes are different.

A few things matter regardless of which path you take.

Write down the done condition before the agent starts. This is the single highest-leverage move for long runs. The Anthropic harness post calls it the feature list; Cursor calls it the planner’s task spec. Either way, it’s an external file with explicit, testable completion criteria, and it exists so the agent can’t quietly redefine done midrun.

Separate the evaluator from the generator. Self-grading is the failure mode. A planner/worker/judge pipeline, or a generator/evaluator pair, is a real architectural pattern, not a stylistic preference. Even if it’s the same model in different roles with different prompts.
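
A minimal sketch of a generator/evaluator pair, assuming two hypothetical callables: generate produces a candidate from the task plus prior feedback, and evaluate returns a list of objections (empty means accepted). Both could wrap the same model with different prompts.

```python
from typing import Callable, Optional


def generate_until_accepted(
    generate: Callable[[str, list[str]], str],  # hypothetical: (task, feedback so far) -> candidate
    evaluate: Callable[[str, str], list[str]],  # hypothetical: (task, candidate) -> objections
    task: str,
    max_attempts: int = 3,
) -> tuple[Optional[str], list[str]]:
    """Return the first candidate the evaluator accepts, or None if attempts run out."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        candidate = generate(task, feedback)
        objections = evaluate(task, candidate)
        if not objections:
            return candidate, feedback
        # The generator never grades itself; it only sees the evaluator's objections.
        feedback.extend(objections)
    return None, feedback
```

Returning None after max_attempts, rather than the last candidate, keeps an unaccepted result from being mistaken for a finished one.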

Invest in the session log, not just the prompt. The append-only event log is what makes the agent recoverable, debuggable, and auditable. If you can’t reconstruct what the agent did in the last 24 hours from durable storage, what you have is a long-running shell script that happens to call an LLM, not a long-running agent.

Treat compaction and context resets as first class. Anthropic is explicit that summarization-as-compaction wasn’t enough for very long jobs; they had to do full context resets where the harness tears the session down and rebuilds it from a structured handoff file. It is essentially how humans onboard a new engineer.

There are some real limitations right now

A few things are still genuinely unsolved.

Cost. A 24-hour run with a frontier model and a few tools is not cheap. Without budgets, circuit breakers, and a hard cap on tool spend, an agent can quietly burn through a week’s API budget in an afternoon. This is solvable, but it’s an explicit step you have to take.
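
A hard cap can be as small as the sketch below. It assumes you can estimate a call’s cost before making it; the class and function names are hypothetical, and amounts are integer cents to avoid floating-point drift.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push cumulative spend past the cap."""


class SpendGuard:
    def __init__(self, cap_cents: int):
        self.cap_cents = cap_cents
        self.spent_cents = 0

    def charge(self, estimated_cents: int) -> None:
        # Check before spending, so the cap is never crossed.
        if self.spent_cents + estimated_cents > self.cap_cents:
            raise BudgetExceeded(
                f"cap {self.cap_cents}c, spent {self.spent_cents}c, requested {estimated_cents}c"
            )
        self.spent_cents += estimated_cents


def guarded_call(guard: SpendGuard, estimated_cents: int, call):
    """Charge the budget first, then run the model or tool call (hypothetical callable)."""
    guard.charge(estimated_cents)
    return call()
```

The harness would catch BudgetExceeded at the top of its loop, write a final progress note, and stop, instead of letting the run continue unattended.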

Security. A long-running agent with API keys, cloud access, and the ability to run shell commands has a much larger attack surface than a chat session. The brain/hands separation pattern matters here too: Credentials should be unreachable from the sandbox where model-generated code runs, which is one of the benefits Anthropic calls out for Managed Agents.

Alignment drift. Over many context windows, agents drift. The original goal gets summarized, then resummarized, then loses fidelity. This is the part hooks and judges exist to defend against. It is also the most common reason “the agent went off and did something I didn’t ask for.”

Verification. Auditing 24 hours of autonomous activity is a real human-time problem. Observability and structured artifacts (PRs, commits, briefings, test runs) are how you make this tractable. Without them, you’re scrolling logs and you’ll miss what matters.

The human role. This is the one I keep coming back to. Defining work crisply enough that an agent can run for a day on it is harder than doing the work yourself. The skill that’s appreciating in value isn’t writing code. It’s writing specs that survive contact with an autonomous executor.

Where this is going

Google, Anthropic, and Cursor have converged on roughly the same shape. Separate the model loop from the execution sandbox from the durable session log. Split planning from generation from evaluation. Bake in compaction, hooks, and context resets. Expose memory as a managed service that any agent invocation can query.

Surface area is what differs; the patterns underneath are the same. Google’s Agent Platform is the enterprise-stack version, with the identity and audit trail story baked in. Claude Managed Agents is “Anthropic’s harness, hosted.” Cursor’s background agents are “long-running coding, pulled out of the IDE and into the cloud.”

The harder problems for the next year aren’t in any of those layers individually. They’re in the coordination above them. Many long-running agents on a shared codebase. Agents that read their own traces and patch their own harnesses. Harnesses that assemble tools and context just in time for a task instead of being preconfigured at startup. That’s where the agent stops looking like a smarter chat window and starts looking like a colleague who’s been on the project longer than you have.

The model is still load-bearing. But the gap between a chat window and an agent you can leave running overnight is mostly in the state, sessions, and structured handoffs wrapped around it. That’s where I’d spend my learning time right now.


