The AI industry is largely failing to ask a key design question, argues theoretical neuroscientist and cognitive scientist Vivienne Ming: are its AI products building human capacity or consuming it?
In the Wall Street Journal, Ming describes her experiment testing which group performed best at predicting real-world events (benchmarked against forecasters on the prediction market Polymarket): AI, humans, or human-AI hybrid teams.
The human groups performed poorly, relying on instinct or whatever information had come across their feeds that morning. The large AI models — ChatGPT and Gemini, in this case — performed considerably better, though still short of the market itself. But when we combined AI with humans, things got more interesting. Most hybrid teams used AI for the answer and submitted it as their own, performing no better than the AI alone. Others fed their own predictions into AI and asked it to come up with supporting evidence. These "validators" had stumbled into a classic confirmation-bias loop: the sycophancy that leads chatbots to tell you what you want to hear, even if it isn't true. They ended up performing worse than an AI working solo.
But in roughly 5% to 10% of teams, something different emerged. The AI became a sparring partner. The teams pushed back, demanding evidence and interrogating assumptions. When the AI expressed high confidence, the humans questioned it. When the humans felt strongly about an intuition, they asked the AI to come up with a counterargument... These teams reached insightful conclusions that neither a human nor a machine could have produced on its own. They were the only group to consistently rival the prediction market's accuracy. On certain questions, they even outperformed it...
We are building AI systems specifically designed to give us the answer before we feel the discomfort of not having it. What my experiment suggests is that the human qualities most likely to matter are not the feel-good ones. They're the uncomfortable ones: the capacity to be wrong in public and stay curious; to sit with a question your phone could answer in three seconds and resist the urge to reach for it. To read a confident, fluent response from an AI and ask yourself, "What's missing?" rather than default to "Great, that's done." To disagree with something that sounds authoritative and to trust your instinct enough to follow it. We don't build these capacities by avoiding discomfort. We build them by choosing it, repeatedly, in small ways: the student who struggles through a problem before checking the answer; the person who asks a follow-up question in a conversation; the reader who sits with a difficult idea long enough for it to actually change their mind. Most AI chatbots today default to easy answers, which is hurting our ability to think critically.
I call this the Information-Exploration Paradox. As the cost of information approaches zero, human exploration collapses. We see it in students who perform better on AI-assisted tasks and worse on everything afterward. We see it in developers shipping more code and understanding it less. We are, in ways that feel like progress, slowly optimizing ourselves out of the loop.
Ming has just published a book called "Robot-Proof: When Machines Have All The Answers, Build Better People." She suggests using AI to "explore uncertainty.... before you accept an AI's answer, ask it for the strongest argument against itself."
She's also urging new performance benchmarks for human-AI hybrid teams.
OpenAI’s GPT-5.5 will be generally available tomorrow in Microsoft Foundry, bringing OpenAI’s latest frontier model to Azure and the enterprise teams building agents for real production work.
GPT-5.5 continues a clear progression in the GPT-5 series. GPT-5 brought unified reasoning and speed into a single system. GPT-5.4 brought stronger multi-step reasoning and early agentic capabilities for enterprise use. GPT-5.5 advances this arc with deeper long-context reasoning, more reliable agentic execution, improved computer-use accuracy, and greater token efficiency—designed for sustained, high-stakes professional workflows.
Powerful models alone aren’t enough to operationalize agentic AI at scale. Microsoft Foundry provides the platform layer that turns frontier models into usable, governable systems, letting enterprises apply security policy and management at the platform level. Foundry is a unified, interoperable environment to build, optimize, and deploy AI applications and agents with confidence. Customers benefit from broad model choice, open and flexible agent frameworks, native integration with enterprise systems and productivity tools, and enterprise-grade security, compliance, and governance. When new models like GPT-5.5 become available, Foundry makes it easy to evaluate, productionize, and scale them without friction.
GPT-5.5 is built for professional scenarios where precision, reliability, and persistence matter. GPT-5.5 Pro, a premium variant, extends reasoning depth and task complexity for the most demanding enterprise workloads.
Improved agentic coding and computer-use: Executes multi-step engineering tasks end-to-end—holding context across large systems, diagnosing the root cause of ambiguous failures at the architectural level, and reasoning through what else in the codebase a fix will affect before making a move. It anticipates downstream testing and review requirements without needing to be told, and navigates software interfaces with improved precision and more reliable recovery when execution takes an unexpected turn.
Autonomous execution and research depth: Goes beyond code to handle the full span of professional work—producing polished deliverables like documents, spreadsheets, and presentations. For research-intensive workflows, GPT-5.5 operates as an active collaborator across the entire arc from question to output: refining drafts across multiple passes, stress-testing analytical reasoning, proposing approaches, and synthesizing across documents, data, and code to drive work forward rather than just answering questions.
Complex reasoning and long-context analysis: Handles extensive documents, codebases, and multi-session histories without losing the thread.
Token efficiency built for scale: GPT-5.5 reaches higher-quality outputs with fewer tokens and fewer retries—lowering cost and latency for production deployments at scale.
GPT-5.5 is particularly well suited for domains where the cost of imprecision is high—such as software engineering, DevOps, legal, health sciences, and professional services. With GPT-5.5 in Microsoft Foundry, customers can pair OpenAI’s latest frontier model with enterprise-grade infrastructure to put agentic AI into production.
Microsoft Foundry: The operating system for GPT-5.5 agents at scale
Access to a frontier model is just the starting point. What we see from customers is that the hard part isn’t building an agent: it’s running thousands of them in production, with real isolation, identity, and governance. That’s where Foundry Agent Service comes in.
A revolution is unfolding in the market. Now a developer can reliably reason through a business problem with a coding agent — humans interact with a model that does the heavy thinking, researches, and asks questions — and the output is a production agent: a declarative workflow suited to a specific task and connected to your business systems.
These declarative agents can be defined in YAML or written in a harness like Microsoft Agent Framework, GitHub Copilot SDK, or virtually any library. With hosted agents in Foundry Agent Service, LangGraph, Claude Agent SDK, and the OpenAI Agents SDK all work the same way. Engineers can run a single command to land agents in an isolated sandbox with a persistent filesystem, a distinct Microsoft Entra identity, and scale-to-zero pricing. Enterprise-ready agents, at scale, powered by GPT-5.5.
Should you build your own mobile release tooling? Engineers from Monzo, Spotify, Etsy, and Tuist share how they made the call, what it actually cost, and whether AI changes the math. Live discussion, May 28 at 10 am PT / 1 pm ET. Hosted by Runway.
Join security researcher and pentester Jan Seredynski on May 12 as he dissects real-world security incidents in banking, food delivery, and e-commerce. From face-verification bypass to location spoofing, we’re breaking down the anatomy of a breach and what teams can do differently to address it.
Dmytro Petrenko outlines seven principles for building consumer-grade Android SDKs, covering API design, thread safety, reactive state, and dependency isolation.
We reach out to more than 80k Android developers around the world, every week, through our email newsletter and social media channels. Advertise your Android-development-related service or product!
At Yazio, our product squads drive our mission to help people live healthier lives. We’re looking for a product-minded Senior Mobile Engineer to build impactful features for millions. You’ll work closely with Product, Engineering, and Design, using Kotlin Multiplatform to deliver for iOS & Android.
The excerpt emphasizes the importance of clarity and maintainability in designing APIs and methods in Microsoft .NET. It explains that while out parameters can be useful, they often lead to reduced readability and increased complexity. Their use should be limited to specific scenarios, particularly the Try pattern, for clearer, more maintainable code.
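The Try pattern the excerpt refers to is a C# idiom: methods like `int.TryParse(text, out int value)` return `false` on failure instead of throwing, confining `out` parameters to that one well-understood shape. Java has no `out` parameters, so the sketch below is an illustrative analog using `OptionalInt`, not code from the article; the design point is the same, failure becomes an ordinary return value rather than a hidden side channel or an exception.

```java
import java.util.OptionalInt;

public class TryPatternSketch {
    // C#'s Try pattern (e.g. int.TryParse) returns a bool and writes the
    // result into an 'out' parameter. In Java, an Optional return carries
    // both the success flag and the value in a single, readable result.
    static OptionalInt tryParseInt(String input) {
        try {
            return OptionalInt.of(Integer.parseInt(input));
        } catch (NumberFormatException e) {
            return OptionalInt.empty(); // failure is a value, not an exception
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParseInt("42"));   // OptionalInt[42]
        System.out.println(tryParseInt("oops")); // OptionalInt.empty
    }
}
```

Callers must check the result before using it, which is exactly the readability benefit the excerpt credits to the Try pattern over general-purpose `out` parameters.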
Checking whether a collection contains a specific item is a routine task in .NET, and with `Contains()` available on many collection types, it’s easy to assume they all perform similarly. In reality, the underlying data structure and search strategy make a dramatic difference, turning what looks like a simple lookup into a potential performance trap in frequently executed code paths. This article explores how different collections approach item searches, why those differences matter, and how making informed choices can lead to faster, more predictable, and more scalable applications.