Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.
147403 stories
·
33 followers

Agents, OpenAI, deepfakes, and the messy reality of the AI boom: A conversation with Oren Etzioni

1 Share
Oren Etzioni, right, speaks with GeekWire’s Todd Bishop at an Accenture event about the future of agents. (GeekWire Photo / John Cook)

[Editor’s Note: Agents of Transformation is an independent GeekWire series and March 24, 2026 event, underwritten by Accenture, exploring the people, companies, and ideas behind AI agents.]

Oren Etzioni got so frustrated flipping between browser windows and following ChatGPT’s step-by-step instructions that he finally asked: Do you work for me, or do I work for you?

ChatGPT’s answer at the time: no, it couldn’t actually do the work for him. Etzioni, a computer scientist who has been building AI systems since the late 1980s, says filling that gap between AI talking and AI taking action is what defines this moment in the technology’s evolution.

But even as AI agents move from concept to reality, Etzioni says the “jagged edge” of functionality remains a stubborn problem: give an agent one request and it saves an hour and a half of work, then give it something nearly identical and it produces garbage.

“We haven’t achieved artificial reliability,” he said. “That’s still a ways off.”

Etzioni spoke with GeekWire at an event hosted by Accenture in Bellevue, Wash., last week, with an audience that included leaders from Microsoft. The University of Washington professor is co-founder of AI agent startup Vercept, founder and technical director of the AI2 Incubator, venture partner at Madrona, and former founding CEO of the Allen Institute for AI. 

Over the course of the evening, Etzioni fielded questions about the emerging landscape of AI agents, the platform competition among the major tech companies, China’s rise in AI research, and the evolving threat of deepfakes to democracy. He also offered some sharp words for OpenAI and advice for leaders navigating AI adoption.

On agents: Etzioni said what’s working now is delegating small, specific workflows — the kind of tasks that used to require flipping between apps and following instructions manually. 

Vercept, for example, lets an agent see what’s on your screen, find the buttons, read the text, and execute tasks directly, rather than relying on what he called the “rickety infrastructure” of APIs and web scraping that breaks whenever something changes.

The bigger picture is messier. Etzioni described Moltbook — the bot-only social network that attracted 1.6 million AI agents over a weekend — as overhyped in its current form, but a signal of what’s coming: a future where software agents interact with each other at scale.

He was blunt about the risks: Moltbook is a “security nightmare,” with agents running on users’ machines, accessing private information, and reading externally posted text that nobody controls, making them vulnerable to prompt injection attacks.

Etzioni pushed back on more dramatic framings of the moment, disagreeing with Microsoft AI CEO Mustafa Suleyman’s claim that we’re witnessing the birth of a new digital species. 

“These are still tools,” he said. “Power tools, but still tools working on our behalf.”

Oren Etzioni, right, speaks with GeekWire’s Todd Bishop at an event hosted by Accenture in Bellevue, Wash. (GeekWire Photo / John Cook)

On the platform competition: Asked how he sees the race among Microsoft, Google, Amazon, OpenAI, and Anthropic, he said he’d short OpenAI stock, if he did that sort of thing.

“They’re running around like a thousand chickens with their heads cut off,” he said, questioning whether the company has a coherent business model beyond its flagship chatbot. “Sure, they’re printing money on ChatGPT, but that’s not their business.”

He’s more bullish on Google, which he described as having the advantage of being vertically integrated, from chips to data to models to talent. “They start on the back foot,” he said, “but Google is poised to — I think the technical phrase is — kick their ass.”

Anthropic and OpenAI are racing toward IPOs as they burn through cash. Once they’re public, Etzioni noted, the quarter-by-quarter results will reveal who’s actually winning.

On China: Etzioni said the stereotype that the country’s AI work is derivative is no longer true. 

He pointed to research his team did at the Allen Institute for AI, tracking academic papers at top AI conferences, which showed Chinese papers rising not just in volume but in quality. That trend, he said, has played out in open-source models and technical innovation as well.

“I’m actually a China hawk — I’m very concerned about China’s role in the world,” he said. “But the solution is not to underestimate, because that would be a mistake.”

Oren Etzioni speaks to Accenture and Microsoft leaders. (GeekWire Photo / John Cook)

On deepfakes: Etzioni spent more than a year running TrueMedia.org, a nonprofit he founded to build tools for newsrooms and fact-checkers to detect deepfakes in the lead-up to the 2024 elections. The good news, he said, is that deepfakes didn’t significantly change election outcomes. The bad news is that the technology has gotten much cheaper and easier to deploy.

Looking forward, he’s concerned about a “denial of democracy attack” — not a single viral deepfake but thousands of AI agents flooding congressional, school board, and mayoral races with coordinated fake media at a scale that current detection systems can’t handle.

“The last war, which was 2024, we won,” he said. “The next war is coming.”

On leadership: Etzioni said AI adoption is not something leaders can delegate to a CIO or general counsel. His three pieces of advice: 

  • Use AI yourself, whether you’re the CEO or the janitor. 
  • Build incentive structures that encourage your team to experiment with it. 
  • Don’t just use AI to do existing work faster; look for things only possible with AI.

“The real gold,” he said, “is when you’re getting AI to do new things that we just didn’t do before.”

Listen to the full conversation on the GeekWire Podcast above, or subscribe to GeekWire in Apple Podcasts, Spotify, or wherever you listen.

Read the whole story
alvinashcraft
42 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

AI Agent & Copilot Podcast: TMC CEO Jen Harris on Building the Partner of the Future

1 Share

In this episode of the AI Agent & Copilot Podcast, John Siefert, host and CEO, Dynamic Communities and Cloud Wars, is joined by Jen Harris, CEO of TMC, to explore how AI agents, automation, and mindset shifts are redefining business. Their discussion spans TMC’s acquisition of TMG, leadership in the partner ecosystem, and why reimagining work is critical now, setting the stage for conversations at the 2026 AI Agent & Copilot Summit NA.

Key Takeaways

  • AI Requires Commitment, Not Caution: Harris emphasizes that half-measures slow progress more than they reduce risk. Organizations that just try one thing often abandon AI too quickly because early results aren’t perfect. She notes, “You fail first at new things,” adding that true adoption requires patience, leadership backing, and a willingness to accept short-term discomfort for long-term gains.
  • Solutions Beat Technology Stacks: Customers no longer want disconnected tools; they want outcomes. Harris explains that clients expect partners to “meet them where they are,” combining Power Platform, Azure, data, and AI into real solutions.
  • Mindset Is the Real Bottleneck: While AI is already embedded in daily life, Harris observes resistance when it enters core business roles. “It’s not quite here yet” is often code for fear of job impact. She challenges leaders to reframe AI as a workload reducer, asking, “What if it would make you less busy?”
  • Reactive Roles Are Disappearing: Harris highlights a coming shift as agents take over repetitive, reactive work. Professionals who built careers on being indispensable specialists must evolve. People will move toward proactive creation, strategy, and value generation.
  • Human Connection Still Matters: Despite rapid automation, Harris stresses that humanity isn’t going away. Reflecting on in-person events, she says, “Look at you — you came out of your offices on a cold day, and we’re talking.” AI may scale intelligence, but trust, inspiration, and shared understanding still comes from people.

AI Agent & Copilot Summit is an AI-first event to define opportunities, impact, and outcomes with Microsoft Copilot and agents. Building on its 2025 success, the 2026 event takes place March 17-19 in San Diego. Get more details.

The post AI Agent & Copilot Podcast: TMC CEO Jen Harris on Building the Partner of the Future appeared first on Cloud Wars.

Read the whole story
alvinashcraft
42 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

Moderna Says FDA Refuses To Review Its Application for Experimental Flu Shot

1 Share
An anonymous reader shares a report: The Food and Drug Administration has refused to start a review of Moderna's application for its experimental flu shot, the company announced Tuesday, in another sign of the Trump administration's influence on tightening vaccine regulations in the U.S. Moderna said the move is inconsistent with previous feedback from the agency from before it submitted the application and started phase three trials on the shot, called mRNA-1010. The drugmaker said it has requested a meeting with the FDA to "understand the path forward." Moderna noted that the agency did not identify any specific safety or efficacy issues with the vaccine, but instead objected to the study design, despite previously approving it. The company added that the move won't impact its 2026 financial guidance. Moderna's jab showed positive phase three data last year, meeting all of the trial goals. At the time, Moderna said the stand-alone flu shot was key to its efforts to advance a combination vaccine targeting both influenza and Covid-19.

Read more of this story at Slashdot.

Read the whole story
alvinashcraft
43 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

Is anyone using AI for good?

1 Share
In a world where AI is replacing human workers, using up energy and water, and deepening disconnect, is AI for humanitarian good even possible? The answer is yes. In the first part of this two part series, we're taking a look at just a few AI do-gooders and what they're doing to fight climate change, make healthcare more accessible, and help their communities.
Read the whole story
alvinashcraft
43 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

Organizational Strategies from the Collective Wisdom of Nature

1 Share

Circa 2016, a logistics company was drowning. Their centralized routing system—the kind most enterprises still use—couldn’t keep pace with millions of daily deliveries. Managers were making routing decisions through layers of approval. Response time measured in hours. In ecommerce, that’s death.

Then they did something counterintuitive.

Instead of building a smarter central command, they dismantled it. Thousands of delivery drivers were told: Take the shortest available route you see, avoid congested zones, coordinate with your neighbors. Ignore the central system if it makes sense to ignore it.

The first week was chaos. Drivers felt unmoored. They’d been trained for years to defer to authority. But the second week, something shifted. Drivers started talking to each other, sharing what they learned. Within months, delivery times dropped 15%. Fuel costs fell 12%. And the system became more resilient to disruptions, not less.

The system reflected one of nature’s key coordination models: swarm intelligence—the collective behavior of thousands of simple agents following basic local rules and producing astonishingly sophisticated global outcomes. Of course, this wasn’t swarm intelligence in the purist sense. It was something more practical: centralized optimization with locally adaptive execution. The routes were still computed by HQ algorithms, but drivers had authority to deviate based on what they observed. This hybrid model—plan centrally, execute locally—outperformed purely rigid centralization.

This matters. Because most organizations make a different mistake: They assume every problem requires either total central control or total decentralization. The reality is more nuanced. Still, the point remains: The natural world offers a range of coordination models your organization can learn from as you address the specific challenges you face.

The Nature That Shaped Intelligence

A leafcutter ant colony doesn’t have a CEO. No board meetings. No quarterly planning sessions. Yet somehow thousands of ants coordinate to strip trees and farm fungus underground with mind-bending efficiency.

A school of fish doesn’t vote on which direction to swim when a predator appears. No consensus process. Each fish simply watches its three nearest neighbors, maintains distance, and matches speed. They move as one organism.

A flock of birds migrates thousands of miles without a navigator. No GPS. No preplanned route. Each bird follows the same three rules: Stay close to your neighbors, don’t collide with them, and match their speed. Somehow they arrive.

But nature has other models too. Bees don’t swarm to find food. They use waggle dances—a signal system where scouts communicate location and quality to the hive, and the colony collectively decides where to forage. This is collective decision-making, not swarm behavior.

Here are some of the models nature uses, and how you might employ them in your organization:

Ant colonies (pheromone-based swarms): Individual ants are cognitively simple. They follow chemical trails. They don’t strategize. They don’t discuss. Coordination emerges from simple stimulus-response rules repeated at scale. This is perfect for routing algorithms. Humans? Not so much. We have language. We overthink. We have egos and agendas.

Bird flocks (proximity-based synchronization): Each bird watches its nearest neighbors and maintains distance, alignment, and speed. This produces coordinated movement without central direction. It’s useful for thinking about organizational synchronization, but in practice, knowledge workers don’t coordinate through proximity. They coordinate through explicit communication.

Bee colonies (collective decision-making via signaling): Scouts find food sources and perform waggle dances; the hive collectively decides where to forage. There’s communication. There’s collective judgment. This maps better to humans—we make decisions through voting, consensus, or appointed authority structures. But we do this through language, not dance.

Small human groups (language-based coordination): Humans naturally work in intimate groups of 5–15 people. We communicate directly. We debate. We explain reasoning. We build trust through repeated interaction. This is our strength. Research on military special forces, surgical teams, and startup founding teams shows that this scale consistently outperforms larger hierarchies for complex, novel work.

Here’s what matters for organizations: Not all nature-inspired coordination is the same, and not all models suit human knowledge work equally well. The mistake organizations make is mixing these models. Trying to run a board decision through swarm logic doesn’t work. Trying to route 10,000 deliveries through consensus doesn’t work. Match the model to the problem.

How to Distribute Intelligence (Without Creating Chaos)

The winning organizations distribute decision-making by problem type, not by ideology. Start by asking yourself “What type of decision is this?”

Optimization problem with clear goals and frequent repetition?

Consider swarm-inspired algorithms or distributed rules. Routing, scheduling, resource allocation. These benefit from parallel exploration and adaptation.

Define simple, transparent local rules. “Always choose the shortest queue” is better than “we’ve determined you should do this.” Transparency builds trust. It enables agents to adapt rules as conditions change.

Establish clear boundaries. Swarms aren’t lawless. Even the most autonomous ant colony operates within biological constraints. Similarly, decentralized decision-making needs guardrails: budget limits, compliance rules, service-level agreements, ethics boundaries. These constraints prevent harmful emergence while preserving autonomy.

Measure emergent patterns. Are teams naturally clustering around customer segments? Are response times improving faster than expected? Are deviations from planned rules creating better outcomes? These patterns reveal whether the system is actually adaptive or just chaotic.

Measure emergent patterns. Are teams naturally clustering around customer segments? Are response times improving faster than expected? Are deviations from planned rules creating better outcomes? These patterns reveal whether the system is actually adaptive or just chaotic.

Want Radar delivered straight to your inbox? Join us on Substack. Sign up here.

Repeated execution with local knowledge advantage?

Delegate authority. A store manager sees local demand before HQ does. A nurse sees treatment patterns before epidemiologists do. Give them authority to act within clear boundaries.

Make authority explicit. People need to know: What can I decide? What requires escalation? Build trust through transparency. And create communication channels. Small teams work because people talk directly. Don’t remove that advantage by adding layers.

Small-group knowledge work or novel problem?

Use small teams with explicit communication. Strategy, product direction, customer account decisions. These need debate, judgment, and reasoning—not local rules.

Preserve hierarchy for initial deliberation, then push decisions down. A CEO might set direction (“we’re entering this market”), but let teams decide execution. Mix levels of control by decision phase.

Then build the culture progressively. Organizations where power has been tightly held resist decentralization. Pilot in bounded domains. A single supply chain. One customer segment. A specific operational challenge. Demonstrate value. Build credibility. Then expand. And yes, this requires middle managers to give up control. Most companies fail here. If you aren’t ready to fire managers who hoard decisions, don’t bother trying to decentralize. They’ll sabotage it.

Strategic or ethical choice?

Humans in a room. These require deliberation, trust-building, and explicit reasoning. You can’t swarm your way through a values decision.

Markets shift faster than executives can perceive. Customer preferences change in real time. Disruptions emerge from nowhere. The solution is to distribute intelligence by matching decision type to coordination mechanism. Nature figured out multiple coordination models. Organizations should too.

Real-World Applications

Smart cities: Traffic signal timing in Copenhagen and Singapore uses distributed coordination. Instead of a central traffic control room synchronizing all lights, intersections coordinate based on local congestion. Signals adjust in real time to vehicle flow. This is closer to true swarm behavior—local rules, no central command, emergent global optimization. The result: reduced congestion and lower emissions.

Healthcare: Diagnostic systems using AI aggregate insights across thousands of clinicians. This is distributed sensing. Every clinician is an antenna. The algorithm learns from patterns observed across all of them simultaneously. Drug discovery accelerates as algorithms explore molecular spaces too vast for sequential testing. This works because the goal is clear (better diagnosis, faster discovery), and local information (what clinicians observe) accumulates into global patterns.

Financial services: Real-time algorithmic trading uses multiple agents executing strategies based on local market signals. Agents respond to local conditions without central coordination. But notice: This only works because the goal is crystal clear (maximize return) and the environment is well-defined (market data). Try this for strategic investment decisions and you’ll have chaos.

Energy systems: Power grid management in renewable-heavy systems uses swarm-like coordination to balance supply and demand in real time. Distributed generators respond to local price signals. Consumers adjust consumption based on local grid conditions. This approximates true swarm behavior because the problem is optimization at scale (balance supply and demand) without central planning.

Small teams with autonomy: Amazon’s two-pizza teams have authority to build and deploy independently. Netflix engineers can deploy code without centralized approval gates. Southwest Airlines gate agents make refund decisions on the spot. These are delegated authority structures—not swarms but genuinely autonomous decision-making. They work because the team is small enough for direct communication, the authority boundaries are clear, and the decisions are nonstrategic (execution choices, not direction).

The Competitive Advantage: Speed Through Matching Model to Problem

Organizations that match coordination model to problem type will outpace competitors trapped in binary thinking (all centralized or all decentralized). The advantage isn’t technological. The algorithms are known. The models are established. The advantage is structural clarity. Companies that can identify problem type, choose the right coordination mechanism, and execute without paralysis will move faster.

This mirrors natural evolution. Ant colonies didn’t succeed because they invented new biology. They succeeded because they used the right coordination model for the problem (optimization at scale). Humans didn’t dominate because we swarm. We dominated because we combine small-group collaboration with individual reasoning. Organizations following the same principle—using the right model for the right problem—will emerge as leaders.

For you as a business leader, the question isn’t whether to adopt distributed thinking. Markets are pushing you there. The question is: What types of decisions should be distributed, and what types should stay centralized? And crucially: What coordination mechanism actually suits each type? Your organization’s intelligence doesn’t live solely in the executive suite. It lives in frontline employees, customer interactions, market data, and operational feedback. The companies winning are the ones learning to access it.

But distributed decision-making isn’t one thing. It’s multiple things—swarms for optimization, delegation for execution, small teams for strategy, humans for judgment. Nature has already shown you multiple models. The only question is whether you’ll use the right one for the right problem.

One size doesn’t fit all.



Read the whole story
alvinashcraft
43 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

The Best AI Models for Coding: Accuracy, Integration, and Developer Fit

1 Share

AI models and coding assistants have become essential tools for developers. Today, developers rely on large language models (LLMs) to accelerate coding, improve code quality, and reduce repetitive work across the entire development lifecycle. From intelligent code completion to refactoring, debugging, and documentation, AI-powered tools are now embedded directly into daily workflows.

Drawing on insights from the latest JetBrains Developer Ecosystem Report 2025, this guide compares the top large language models (LLMs) used for programming. It focuses on how leading LLMs balance accuracy, speed, security, cost, and IDE integration, helping developers and teams choose the right model for their specific needs.

Throughout the article, we also highlight how tools like JetBrains AI Assistant bring these models directly into professional development environments, backed by real-world usage data from the report.

Please note that the models listed in the article reflect those available during the research period, and may not reflect the most recent versions.

Table of contents

  • What are AI models for coding?
  • How developers choose between AI models
  • Top AI models in 2025
  • Evaluation criteria for AI coding assistants
  • Open-source vs. proprietary models
  • Enterprise readiness and security
  • How to select the right AI coding model for you
  • FAQ
  • Conclusion

What are AI models for coding?

AI models for coding are large language models (LLMs) trained on vast collections of source code, technical documentation, and natural language text. Their purpose is to understand programming intent and generate relevant, context-aware responses that assist developers during software creation. Unlike traditional static tools, these models can reason about code structure, explain logic, and adapt to different programming languages and frameworks.

The best LLMs for programming support a wide range of everyday development tasks, most typically being used for code completion, refactoring, debugging, documentation writing, and test creation. By delegating such repetitive or boilerplate-related tasks to an LLM, developers can turn their attention to more complex problem-solving and system design tasks.

Most developers interact with AI coding tools through IDE integrations, browser tools, or APIs. This is where IDE-based assistants, such as JetBrains AI Assistant, are particularly valuable, as they operate directly within the development context, using project structure, files, and language semantics to improve accuracy and relevance.

The use of AI coding tools is influenced by several critical factors, including accuracy, latency, cost efficiency, and data privacy. According to the JetBrains Developer Ecosystem Report 2025, AI adoption was increasingly widespread, with up to 85% of developers regularly using AI tools for coding and development in 2025.

As AI capabilities expand, developers face an important challenge: selecting those AI models that best fit their workflow. The next section discusses how developers can evaluate the various options and make the best decision for their needs.

How developers choose between AI models

Developers’ adoption of AI coding tools in 2025 was driven by how well an AI model integrated into real-world workflows and delivered consistent output. This often goes beyond technical specs alone and involves various practical and trust-based factors. 

The top concern identified in the JetBrains Developer Ecosystem Report 2025 was code quality. IDE integration was another major priority. AI tools for developers that work seamlessly inside familiar environments, such as JetBrains IDEs, are far more likely to be adopted than standalone interfaces. Pricing and licensing also mattered for developers, especially for individual developers and small teams who need predictable or affordable access.

For professional teams, data privacy and security increasingly shape decision-making around AI model selection. The ability to control how prompts and code are processed, whether models can be deployed locally, and how data is retained or logged are all critical considerations. Customization options, including fine-tuning and contextual prompts, are also becoming more relevant as teams seek domain-specific optimization.

Overall, insights from the report indicated a clear divide. Individual developer AI preferences prioritize usability, responsiveness, and cost efficiency. But for organizations, the principal focus areas were compliance, governance, and long-term scalability.

Key selection factors for AI coding assistants

This table summarizes the core criteria developers use for quick comparison.

CriterionWhy it mattersHow to assess
Code qualityDetermines whether generated code is correct, maintainable, and consistent with best practicesEvaluate accuracy and reasoning in real coding scenarios
IDE integrationAffects workflow continuity and adoption rateCheck for native support in JetBrains IDEs or other editors
Price and licensingInfluences accessibility for individuals and teamsCompare pricing tiers, free limits, and scalability costs
Data privacy and securityEnsures that code and prompts are handled safelyVerify local execution, encryption, and data policy
Local or self-hosted optionsImportant for teams with compliance or IP control needsAssess support for private model deployment
Fine-tuning and customizationEnables domain-specific improvements and internal optimisationCheck whether the model supports custom training or contextual prompts

With these criteria in mind, the next section explores the top AI models developers used in 2025 and how they compare in practice.

Top AI models used in 2025

The JetBrains Developer Ecosystem Report 2025 showed that developers did not rely on a single LLM. Instead, they used a small set of the best AI models for coding, depending on accuracy needs, workflow integration, cost constraints, and data-handling requirements.

Based on developer survey data, the report identified the following AI models in 2025 as the most commonly used and trusted for coding tasks. It forms the basis of an AI coding assistants comparison guide that is grounded in real-world adoption rather than theoretical benchmarks:

GPT models (OpenAI): Models like GPT-5 and GPT-5.1 were widely used and recognized as some of the best LLMs for programming in day-to-day development, particularly for code generation, refactoring, and explanation tasks. These models were incorporated in daily workflows due to their consistent output quality and large context windows. Their trade-off is cost, especially for teams with heavy usage.

Claude models (Anthropic): Claude 3.7 Sonnet was commonly chosen by developers working with large files, monorepos, or documentation-heavy projects. It was frequently cited among top AI code assistants for its ability to reason over long inputs and maintain structure in explanations and generated code. However, compared to GPT-based tools, it offers fewer native integrations.

Gemini (Google): Gemini 2.5 Pro appeared most often in workflows tied to Google’s ecosystem. Developers reported using it for tasks that combine coding with documentation, search, or collaborative environments. While it performed well in speed and accessibility, it is less flexible for teams that require deep customization or private deployments when evaluating AI models in 2025.

DeepSeek: DeepSeek R1 gained attention among developers seeking lower-cost AI coding assistance or local deployment options. It is increasingly included in AI coding assistant comparisons for teams experimenting with AI at scale while maintaining tighter control over data and infrastructure.

Open-source models: These models, such as Qwen and StarCoder, represented another category of best LLMs for programming for a smaller but growing segment of developers. They are most popular among teams with strong DevOps capabilities or strict data-governance requirements. While they offer maximum control, they also require significant operational effort.

Overall, differences in reasoning accuracy, speed, context length, and IDE integration significantly influenced developer preferences when selecting among the best AI models for coding. All of these impacted developer preferences. For instance, some developers prioritized performance and reasoning depth with GPT-4o or Claude 3.7. Others chose more cost-efficient or private alternatives, such as DeepSeek and open-source models, depending on workflow and organizational constraints.

Capabilities of leading AI models for coding

ModelDeployment modelConfig / InterfaceBest forStrengthTrade-off
GPT-5 / GPT-5.1Cloud / APIText + code inputBroad coding and reasoning tasksHigh accuracy and large contextHigher cost per token
Claude 3.7 SonnetCloud / APINatural language focusStructured code and documentationContextual reasoning, long input handlingLimited tool integrations
Gemini 2.5 ProCloudMultimodal, Google ecosystemWeb-based workflowsFast response, cloud collaborationLimited fine-tuning
DeepSeek R1Cloud / LocalAPI and SDKCost-efficient large-scale codingCompetitive performance, local optionSmaller ecosystem
Open-source models (Qwen, StarCoder, etc.)Local / Self-hostedVariousPrivacy-first or custom useControl, modifiabilitySetup complexity, maintenance
Disclaimer: models listed reflect those available at the time the research concluded and may not represent the most recent versions.

Pricing and total cost of ownership (TCO) comparison

Model typeCost profileScaling considerations
GPT familyUsage-based, higher per-token costScales well but requires budget planning
Claude familyUsage-based, mid-to-high costEfficient for long-context tasks
GeminiBundled cloud pricingOptimized for cloud environments
DeepSeekLower usage costsAttractive for frequent queries
Open-sourceInfrastructure-dependentNo license fees, higher ops cost

The next section builds on this by presenting a clear framework for objectively evaluating these models.

Evaluation criteria for AI coding assistants

Selecting an AI coding assistant requires balancing multiple factors rather than optimizing for a single metric, a reality reflected in any meaningful comparison of AI coding assistants. Accuracy, speed, cost, integration, and security all play a role, and their relative importance depends on whether the tool is used for personal productivity, enterprise compliance, or research and experimentation when identifying the best AI for software development.

Developers surveyed in the JetBrains Developer Ecosystem Report 2025 consistently cited code accuracy and IDE integration as top priorities when evaluating LLMs. However, organizational users also emphasized governance, transparency, and scalability as part of a broader AI model assessment.

Core evaluation criteria for AI coding assistants

CriterionWhy it mattersHow to assess
Accuracy and reasoningDetermines the reliability of code suggestions, explanations, and test generationCompare model output on real codebases or benchmark problems
Integration and workflow fitEnsures smooth adoption inside IDEs and CI/CD pipelinesVerify compatibility with JetBrains IDEs, VS Code, or API connectors
Cost and scalabilityAffects accessibility for individual and organizational usersReview token pricing, API quotas, or enterprise licensing
Security and data privacyProtects proprietary code and complies with organizational standardsCheck data retention policies, encryption, and local deployment options
Context length and memoryImpacts how well the model understands complex projects or filesEvaluate maximum input size and conversational continuity.
Customization and fine-tuningEnables adaptation to specific domains or internal librariesDetermine whether the model allows prompt tuning, embeddings, or private training
Transparency and governanceImportant for auditability and complianceConfirm whether logs, audit trails, and explainability tools are available

These criteria underscore a fundamental choice developers must make between open-source and proprietary AI models, discussed in the next section.

Open-source vs. proprietary models

AI coding assistants generally fall into two categories: open-source or locally deployed models and commercial, cloud-managed models. A choice between them affects everything from data handling to performance and maintenance.

The JetBrains Developer Ecosystem Report 2025 showed that most developers currently rely on cloud-based proprietary AI coding tools, but a growing segment preferred local or private deployments due to security and compliance requirements. This group increasingly turned to local LLMs for coding and leveraged open-source models.

General industry patterns, specifically when it comes to a comparison of AI platforms, suggest there are different reasons behind this choice. Teams that choose open-source AI models for coding often seek transparency, customization, and infrastructure control. Proprietary models, on the other hand, offer faster onboarding, reliability, and vendor-managed updates.

While there is no single “best” option, the selection of either an open-source or proprietary model comes down to organizational priorities such as compliance, scalability, and available DevOps resources. The following comparison table summarizes each type’s advantages, limitations, and best-fit scenarios.

Comparison of open-source and proprietary AI coding models

TypeAdvantagesLimitationsBest fit
Open-source / Local models (e.g. StarCoder, Qwen, DeepSeek Local)Full control of infrastructure and data, ability to customize and fine-tune, no recurring license feesRequires setup and maintenance effort; updates and security are handled internally; performance may depend on local hardwareTeams with strong DevOps capabilities or strict data-governance requirements
Proprietary / Managed models (e.g., GPT-5, Claude 3.7, Gemini Pro)Fast setup, robust integrations, vendor-handled compliance, predictable performance, and enterprise supportCosts scale with usage; potential vendor lock-in; less transparency in training dataIndividual developers and growing teams focused on speed and reduced operational overhead
Disclaimer: models listed reflect those available at the time of research conclusion and may not represent the most recent versions.

Now that we have explored the various models open to developers, we will examine enterprise-readiness and security and consider how organizations evaluate governance, compliance, and reliability when adopting AI coding solutions.

Enterprise readiness and security

Enterprise AI coding tools must meet requirements far beyond accuracy or productivity gains. Security, compliance, and governance also play a decisive role.

According to the JetBrains Developer Ecosystem Report 2025, many companies hesitated to adopt AI coding tools due to concerns about data privacy, IP protection, and model transparency.  These need to be addressed to ensure secure AI for developers.

To achieve this, enterprise-ready AI models typically offer flexible deployment, role-based access control, encryption, audit logs, policy enforcement, and AI governance and compliance.

Some tools, such as JetBrains AI Assistant, support both cloud and on-premises integration, which suits teams that need a balance between agility and compliance. The table below also summarizes the capabilities and example tools required to create enterprise-ready LLMs.

Enterprise evaluation matrix for AI coding tools

CapabilityWhy it mattersExample tools
Deployment flexibilityEnterprises need to control where data and models run to meet compliance and integration requirementsTeamCity, JetBrains AI Assistant (self-hosted), GitLab, DeepSeek Local
Role-based access control (RBAC) and SSOCentralizes identity management and reduces risk of unauthorized accessJetBrains AI Assistant, Harness, GitLab
Audit and traceabilitySupports compliance with ISO, SOC, and internal governance auditsTeamCity, Jenkins (plugins), JetBrains AI Assistant
Policy as code / ApprovalsEnables automated enforcement of deployment and review policiesHarness, GitLab, TeamCity
Data privacy and encryptionProtects source code and proprietary data during inference or storageJetBrains AI Assistant, Claude 3.7 (enterprise), DeepSeek Local
Disaster recovery and backupsMinimizes downtime and preserves continuity in case of system failuresJetBrains Cloud Services, GitLab Self-Managed
Compliance standardsEnsures alignment with SOC 2, ISO 27001, GDPR, or regional equivalentsJetBrains AI Assistant, GitLab, Harness

Now that we understand how to create an enterprise evaluation matrix, the next section will explain how teams can choose the right AI coding model based on their specific needs, balancing control, speed, and compliance.

How to select the right AI coding model for you

The best AI for developers depends on context. They must balance control, cost, integration, and compliance to find the best LLM for team workflows, rather than look for a single winner.

As you have seen, each model is suited to meet specific needs, be they speed, governance, or flexibility. This 8-step selection framework will guide you on how to find the right AI coding model for your requirements when choosing an AI assistant.

Step-by-step selection framework

StepQuestionIf “yes” →If “no” →
1Need full data control or on-premises security?Use local or self-hosted models (DeepSeek Local, Qwen, open-source)Continue
2Primarily using JetBrains IDEs?Use JetBrains AI Assistant (supports multiple LLMs)Continue
3Need a model optimized for GitHub workflows?Choose GPT-4o or GitHub CopilotContinue
4Require large context handling for complex codebases?Claude 3.7 Sonnet or Gemini 2.5 ProContinue
5Need cost efficiency for frequent queries?DeepSeek R1 or open-source alternativesContinue
6Require enterprise compliance (RBAC, SSO, audit logs)?JetBrains AI Assistant, Harness, or GitLabContinue
7Prefer minimal setup and fast onboarding?Managed cloud models (GPT-4o, Claude, Gemini)Continue
8Working with multi-language or monorepo projects?JetBrains AI Assistant or GPT-4o.Continue
Disclaimer: models listed reflect those available at the time of research conclusion and may not represent the most recent versions.

Summary takeaways

How to choose an AI coding model:

  • Need control → local or open-source models
  • Need speed → GPT or Claude
  • Need compliance → JetBrains AI Assistant
  • Focus on collaboration → IDE-integrated tools

Align your tool choice with your team’s priorities.

Now that you have the right AI coding model, in the next section, we will answer the most common developer questions about AI coding tools.

FAQ

Q: Which AI model was most popular among developers in 2025?
A: GPT-4o, Claude 3.7 Sonnet, and Gemini 2.5 Pro were the most frequently used AI models for coding tasks, according to the JetBrains Developer Ecosystem Report 2025.

Q: Are there free or affordable AI models for coding?
A: Yes. DeepSeek R1 and open-source models like Qwen or StarCoder provide cost-efficient options for developers exploring AI assistance.

Q: Which AI coding tools integrate best with JetBrains IDEs?
A: JetBrains AI Assistant integrates multiple LLMs, including GPT and Claude models, directly into IDE workflows for real-time suggestions and contextual understanding.

Q: Is it safe to use AI coding tools for proprietary projects?
A: Yes, if using tools with strong data privacy policies or local execution options. Many teams adopt private or on-premises models to retain full control of source code.

Q: What’s the difference between cloud and local AI models?
A: Cloud models offer convenience and scalability, while local or self-hosted models provide greater data control and compliance for enterprise use.

Q: Which AI model is best for enterprise environments?
A: Enterprise-ready tools like JetBrains AI Assistant, Claude for Teams, and Harness provide features such as RBAC, audit logs, and SSO for secure governance.

Q: How widely are AI tools adopted among developers?
A: As seen in data shared from the JetBrains Developer Ecosystem Report 2025 earlier, more than two-thirds of professional developers used some form of AI coding assistance, reflecting strong industry-wide adoption.

The next section will summarize the key insights and encourage readers to explore JetBrains AI tools for their own development workflows.

Conclusion

AI coding models have moved from experimentation to everyday development practice. Developers now rely on AI assistants to write, review, and understand code at scale. GPT, Claude, Gemini, and DeepSeek lead the field, while open-source and local options continue to gain traction for privacy and customization.

The JetBrains Developer Ecosystem Report 2025 found that there is no single best AI model for coding. The right choice depended on workflow, team size, and governance requirements. As AI-assisted development evolves, improvements in reasoning, context length, and IDE integration will further shape how developers build software with AI’s help.

To experience these capabilities firsthand, start exploring AI-powered development today. Learn more about JetBrains AI Assistant, and see how it can enhance your development workflow.

Read the whole story
alvinashcraft
43 minutes ago
reply
Pennsylvania, USA
Share this story
Delete
Next Page of Stories