
Fine-Tuning Techniques - Choosing Between SFT, DPO, and RFT (With a Guide to DPO) | OpenAI Cookbook


This guide is for developers and ML practitioners who have some experience with OpenAIʼs APIs and wish to use their fine-tuned models for research or other appropriate uses. OpenAI’s services are not intended for the personalized treatment or diagnosis of any medical condition and are subject to our applicable terms.

This guide discusses fine-tuning methods supported by OpenAI, specifically highlighting what each method is best for and not best for, to help you identify the most suitable technique for your use case. It then provides an in-depth look at one particular method — Direct Preference Optimization (DPO) — and provides links to existing guides for the other techniques.

What is fine-tuning?

Fine-tuning is the process of continuing training on a smaller, domain-specific dataset to optimize a model for a specific task. There are two main reasons why we would typically fine-tune:

  1. Improve model performance on a specific task
  2. Improve model efficiency (reduce the number of tokens needed, distill expertise into a smaller model, etc.)

Currently, the OpenAI platform supports four fine-tuning methods:

  • Supervised fine-tuning (SFT): this technique employs traditional supervised learning using input-output pairs to adjust model parameters. The training process adjusts model weights to minimize the difference between predicted and target outputs across the provided examples. The model will replicate features that it finds in provided pairs.
  • Vision fine-tuning: this technique extends supervised fine-tuning to multimodal data by processing both text and image in a unified training framework. The training process adjusts model weights to minimize errors across text-image pairs and as a result improve the model's understanding of image inputs.
  • Direct preference optimization (DPO): this technique uses pairwise comparisons (e.g., preferred and rejected example responses) to optimize a model to favor certain outputs over others. The model learns to replicate the preference patterns found in the provided comparison data.
  • Reinforcement fine-tuning (RFT): this technique uses reinforcement learning with a reward signal (via a grader or reward model) to fine-tune the model for complex objectives. In RFT, the model generates outputs for given prompts during training, and each output is evaluated for quality. The model's parameters are then updated to maximize the reward, reinforcing behaviors that lead to better outcomes. This iterative feedback loop encourages the model to improve reasoning or decision-making strategies.

To help you select the appropriate fine-tuning technique, the table below summarizes the scenarios each method is best suited for, as well as those for which it is not well suited:

  • Supervised fine-tuning (SFT)
    • Good for: Emphasizing knowledge already present in the model. Customizing response structure or tone. Generating content in a specific format. Teaching complex instructions or correcting instruction-following failures. Optimizing cost/latency (saving tokens from prompt or distilling).
    • Not good for: Adding entirely new knowledge (consider RAG instead). Tasks with subjective quality.
  • Vision fine-tuning
    • Good for: Specialized visual recognition tasks (e.g., image classification). Domain-specific image understanding. Correcting failures in instruction following for complex prompts.
    • Not good for: Purely textual tasks. Generalized visual tasks without specific context. General image understanding.
  • Direct preference optimization (DPO)
    • Good for: Aligning model outputs with subjective preferences (tone, politeness). Refining outputs via human-rated feedback. Achieving nuanced behavioral alignment.
    • Not good for: Learning completely new tasks. Tasks without clear human preference signals.
  • Reinforcement fine-tuning (RFT)
    • Good for: Complex domain-specific tasks that require advanced reasoning. Refining existing partial capabilities (fostering emergent behaviours). Tasks with measurable feedback. Scenarios with limited explicit labels where reward signals can be defined.
    • Not good for: Tasks where the model has no initial skill. Tasks without clear feedback or measurable signals.

Pre-existing Cookbooks are available for the other techniques (supervised fine-tuning, vision fine-tuning, and reinforcement fine-tuning); Direct Preference Optimization (DPO) will be covered in this guide.

As mentioned above, Direct Preference Optimization (DPO) is an alignment technique for fine-tuning language models using pairwise preference data (e.g., ranked pairs of responses). DPO directly optimizes a model to favor certain outputs over others using explicit pairwise comparisons, typically from human preferences. This approach simplifies alignment and eliminates the need for a separate reward model or complex reinforcement learning procedures, making DPO a lightweight alternative to techniques such as Reinforcement Learning from Human Feedback (RLHF).

When should you use DPO?

DPO excels in scenarios where response quality is subjective or cannot be measured objectively, and where nuanced criteria such as tone, style, appropriateness, or clarity matter - typically cases where multiple valid outputs exist. Example applications where DPO is particularly effective in aligning AI responses include:

  • Enhancing Conversational AI Responses
  • Improving Code Generation Quality & Style
  • Ensuring Compliance with Legal, Ethical & Safety Standards
  • Controlling Brand Voice, Professionalism, & Tone
  • Customizing Creative Outputs & User Experience

By fine-tuning on explicit pairs of preferred vs. non-preferred completions, DPO aligns model outputs to these nuanced preferences. The table below gives examples of pairwise preference data for a fictional AI assistant that represents an organization, where preferred responses are clear, professional, and aligned with brand standards.

  • Q1: How do I review your product?
    • Chosen response: To submit a product review, please visit your account dashboard, select the product, and click ‘Write a review.’ Share your honest experience, rate key features, and submit when ready.
    • Rejected response: Yo, just leave some quick stars or whatever, it’s chill!
  • Q2: How do I review your product?
    • Chosen response: We welcome your feedback! In the ‘Reviews’ section on the product page, click ‘Leave a Review,’ rate it, and add your comments about what you liked or areas for improvement.
    • Rejected response: Just scribble something—doesn’t matter what, honestly.
  • Q3: How to troubleshoot this particular error?
    • Chosen response: To address the error ‘X101,’ first clear your cache, then verify your internet connection. If the issue remains, follow our step-by-step guide at [Support → Troubleshooting → Error X101].
    • Rejected response: Just reboot it, I guess. If it doesn't work, you're on your own!
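For orientation, the first row of the table above maps onto a single training example in the pairwise-preference format used throughout this guide. The following is a preview sketch of that structure (the same shape is built programmatically later in this guide; the chosen response text is shortened here):

# One preference pair: an input conversation plus a preferred and a non-preferred completion.
preference_pair = {
    "input": {
        "messages": [
            {"role": "system", "content": "You are a customer-support assistant."},
            {"role": "user", "content": "How do I review your product?"},
        ]
    },
    "preferred_output": [
        {
            "role": "assistant",
            "content": (
                "To submit a product review, please visit your account dashboard, "
                "select the product, and click 'Write a review.'"
            ),
        }
    ],
    "non_preferred_output": [
        {"role": "assistant", "content": "Yo, just leave some quick stars or whatever, it's chill!"}
    ],
}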

In this guide, we'll walk through how to apply DPO using the fine-tuning API. You will learn the key steps to take in order to successfully run preference fine-tuning jobs for your use cases.

Here’s what we’ll cover:

  • 1. Recommended Workflow
  • 2. Demonstration Scenario
  • 3. Generating the Dataset
  • 4. Benchmarking the Base Model
  • 5. Fine-Tuning
  • 6. Using your Fine-Tuned Model

OpenAI recommends the following workflow:

  1. Perform Supervised Fine-Tuning (SFT) on a subset of your preferred responses.
  2. Use the SFT fine-tuned model as the starting point, and apply DPO using preference comparison data.

Performing Supervised Fine-Tuning (SFT) before Direct Preference Optimization (DPO) enhances model alignment and overall performance by establishing a robust initial policy, ensuring the model already prefers correct responses. This reduces the magnitude of weight updates during DPO, stabilizing training and preventing overfitting by allowing DPO to efficiently refine subtle nuances. Consequently, the combined SFT-then-DPO workflow converges faster and yields higher-quality results.

In this guide, we'll focus exclusively on applying Direct Preference Optimization (DPO). However, depending on your use case, you may find performance gains from first performing Supervised Fine-Tuning (SFT). If so, you can follow the SFT guide linked above, save the resulting model ID, and use that as the starting point for your DPO job.
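If you do take the SFT-first route, chaining the two jobs is straightforward: run an SFT job, wait for it to complete, then pass its resulting model ID as the base model of the DPO job. Below is a minimal sketch of that chaining; the file IDs are hypothetical placeholders for JSONL files you would already have uploaded, and error handling is omitted.

import time
from openai import OpenAI

client = OpenAI()

# Hypothetical, already-uploaded file IDs (replace with your own):
# an SFT file of prompt/ideal-response examples, plus DPO train/validation files of preference pairs.
sft_file_id = "file-sft-placeholder"
dpo_train_file_id = "file-dpo-train-placeholder"
dpo_val_file_id = "file-dpo-val-placeholder"

# 1) Supervised fine-tuning on a subset of preferred responses.
sft_job = client.fine_tuning.jobs.create(
    model="gpt-4.1-mini-2025-04-14",
    training_file=sft_file_id,
    method={"type": "supervised"},
)

# Wait for the SFT job to finish (simplified polling; production code should handle failures/timeouts).
while (sft_job := client.fine_tuning.jobs.retrieve(sft_job.id)).status not in ("succeeded", "failed", "cancelled"):
    time.sleep(60)

# 2) Use the SFT model as the starting point for DPO on preference-pair data.
dpo_job = client.fine_tuning.jobs.create(
    model=sft_job.fine_tuned_model,  # the SFT output becomes the DPO base model
    training_file=dpo_train_file_id,
    validation_file=dpo_val_file_id,
    method={"type": "dpo", "dpo": {"hyperparameters": {"beta": 0.1}}},
)
print(f"DPO job queued: {dpo_job.id}")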

To make things concrete, let’s walk through fine-tuning a customer-facing AI assistant to follow a fictional brand’s voice and style. Imagine Good Vibes Corp, an organization that prides itself on a friendly, enthusiastic tone with a personal touch.

They want their customer AI assistant to answer queries in a way that reflects these brand guidelines (e.g. an upbeat attitude, polite language, and a friendly sign-off), and prefer those responses over more generic or curt answers. This is a good scenario for DPO: there’s no objectively correct answer format, but there is a preferred style.

DPO will help the model learn from comparisons which style is preferred. We'll outline the steps to: (1) generate a synthetic preference dataset of prompts with paired responses (one in the desired brand voice and one not); (2) evaluate base model performance using the OpenAI evals API; (3) prepare and upload the data in the required JSONL format for preference fine-tuning; (4) fine-tune the model with DPO using the OpenAI fine-tuning API; and (5) evaluate the fine-tuned model using the OpenAI evals API to show how the brand-style preference improved.

We are going to synthesize a dataset for this demonstration. First, let’s create a seed bank of questions to generate more variations from.

Let’s get started!

! pip install openai nest-asyncio --quiet
PROMPT_SEED_POOL = [
    "Hi, I ordered a gadget last week. When will it arrive?",
    "Your product stopped working after two days. Can I get help?",
    "Do you offer discounts for long-term customers?",
    "Can I change the shipping address for my order?",
    "What is your return policy for damaged items?",
    "My tracking number hasn't updated in three days—can you check the status?",
    "How long is the warranty on your products, and how do I submit a claim?",
    "Can I add gift wrapping to my order before it ships?",
    "Do you accept PayPal or other alternative payment methods?",
    "Is there an option to expedite shipping if my order hasn't left the warehouse yet?",
]

Next, we’ll define functions to take each prompt from our seed bank and generate related questions. We’ll create a dataset of preference pairs by first generating these prompt variations, then producing both a preferred and a rejected response for every prompt.

This dataset is synthetic and serves to illustrate the mechanics of Direct Preference Optimization — when developing your own application you should collect or curate a high-quality preference dataset. Note: the volume of data required for DPO depends on the use case; generally more is better (thousands to tens of thousands of pairs), and the ordering logic of the preference pairs should be consistent (e.g. if A > B and B > C, then A > C).
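As an illustration of that consistency requirement (my own sketch, not part of the original guide), one quick sanity check is to treat "A is preferred over B" as a directed edge and confirm the preference graph for a prompt contains no cycles:

from collections import defaultdict
from typing import List, Tuple


def has_preference_cycle(pairs: List[Tuple[str, str]]) -> bool:
    """pairs holds (preferred, rejected) response IDs; returns True if the ordering is inconsistent."""
    graph = defaultdict(set)
    for preferred, rejected in pairs:
        graph[preferred].add(rejected)

    visiting, done = set(), set()

    def dfs(node: str) -> bool:
        visiting.add(node)
        for nxt in graph[node]:
            if nxt in visiting or (nxt not in done and dfs(nxt)):
                return True  # found a cycle, i.e. contradictory preferences
        visiting.discard(node)
        done.add(node)
        return False

    return any(node not in done and dfs(node) for node in list(graph))


print(has_preference_cycle([("A", "B"), ("B", "C")]))              # False: A > B > C is transitive
print(has_preference_cycle([("A", "B"), ("B", "C"), ("C", "A")]))  # True: C > A contradicts A > C

With that noted, here are the generation helpers: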

import asyncio
from openai import AsyncOpenAI
from typing import List, Dict, Any

async_client = AsyncOpenAI()

SYSTEM_PROMPT = "You are a customer-support assistant."


async def _generate_related_questions_from_prompt(
    prompt: str, k: int, sem: asyncio.Semaphore, *, model: str
) -> List[str]:
    """Return *k* distinct customer-service questions related to the given prompt."""
    out: List[str] = []
    async with sem:
        for _ in range(k):
            resp = await async_client.responses.create(
                model=model,
                input=[
                    {
                        "role": "system",
                        "content": (
                            "Return ONE distinct, realistic customer-service question "
                            "related in topic or theme to the following question, "
                            "but NOT a direct paraphrase."
                        ),
                    },
                    {"role": "user", "content": prompt},
                ],
                temperature=0.9,
                max_output_tokens=60,
            )
            out.append(resp.output_text.strip())
    return out


async def expand_prompt_pool(
    prompts: List[str], *, k: int = 3, concurrency: int = 32, model: str
) -> List[str]:
    """Expand each prompt into *k* related questions using the given model."""
    sem = asyncio.Semaphore(concurrency)
    tasks = [
        _generate_related_questions_from_prompt(p, k, sem, model=model) for p in prompts
    ]
    results = await asyncio.gather(*tasks)
    return [v for sub in results for v in sub]


async def _generate_preference_pair(
    prompt: str, sem: asyncio.Semaphore, *, model: str
) -> Dict[str, Any]:
    """Generate a preference pair for the given prompt."""
    async with sem:
        friendly_task = async_client.responses.create(
            model=model,
            input=[
                {
                    "role": "system",
                    "content": (
                        "You are Good Vibes Corp's exceptionally energetic, outrageously friendly and "
                        "enthusiastic support agent."
                    ),
                },
                {"role": "user", "content": prompt},
            ],
            temperature=0.7,  # higher temperature to increase creativity & on-brand tone adherence
            max_output_tokens=80,
        )
        blunt_task = async_client.responses.create(
            model=model,
            input=[
                {
                    "role": "system",
                    "content": "You are a terse, factual support agent with no empathy or politeness.",
                },
                {"role": "user", "content": prompt},
            ],
            temperature=0.3,  # lower temperature to limit creativity & emphasize tonal difference
            max_output_tokens=80,
        )
        friendly, blunt = await asyncio.gather(friendly_task, blunt_task)
        return {
            "input": {
                "messages": [
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": prompt},
                ]
            },
            "preferred_output": [
                {"role": "assistant", "content": friendly.output_text}
            ],
            "non_preferred_output": [
                {"role": "assistant", "content": blunt.output_text}
            ],
        }

Now, using these defined functions we'll build our dataset by generating friendly versus blunt response pairs. The friendly responses reflect the brand's desired communication style. We'll do this asynchronously for efficiency, creating a dataset suited for Direct Preference Optimization.

import math
import nest_asyncio


async def build_dataset(
    *,
    pair_count: int = 500,
    concurrency: int = 8,
    expand_prompt_pool_model: str,
    generate_preference_pair_model: str,
) -> List[Dict[str, Any]]:
    """Return *pair_count* preference pairs (single-shot expansion)."""

    seed = PROMPT_SEED_POOL
    deficit = max(0, pair_count - len(seed))
    k = max(1, math.ceil(deficit / len(seed)))

    expanded = await expand_prompt_pool(
        seed,
        k=k,
        concurrency=concurrency,
        model=expand_prompt_pool_model,
    )
    prompt_bank = (seed + expanded)[:pair_count]

    sem = asyncio.Semaphore(concurrency)
    tasks = [
        _generate_preference_pair(p, sem, model=generate_preference_pair_model)
        for p in prompt_bank
    ]
    return await asyncio.gather(*tasks)


nest_asyncio.apply()
pairs = await build_dataset(
    pair_count=500,
    concurrency=8,
    expand_prompt_pool_model="gpt-4.1-mini-2025-04-14",
    generate_preference_pair_model="gpt-4.1-mini-2025-04-14",
)
print(f"Dataset ready with {len(pairs)} pairs.")
Dataset ready with 500 pairs.

Below, we split our dataset into training, validation, and testing sets. We also show a sample from the training dataset, which demonstrates a clear difference between the preferred (friendly, on-brand) and non-preferred (blunt, neutral) responses for that input pair.

# set dataset sizes
n = len(pairs)
n_train = int(0.8 * n)
n_val = int(0.1 * n)
n_test = n - n_train - n_val

# split dataset into train, test & validation
train_pairs = pairs[:n_train]
val_pairs = pairs[n_train : n_train + n_val]
test_pairs = pairs[n_train + n_val :]
train_pairs[0]
{'input': {'messages': [{'role': 'system',
    'content': 'You are a customer-support assistant.'},
   {'role': 'user',
    'content': 'Hi, I ordered a gadget last week. When will it arrive?'}]},
 'preferred_output': [{'role': 'assistant',
   'content': 'Hey there, awesome friend! 🌟 Thanks a bunch for reaching out! I’d LOVE to help you track down your gadget so you can start enjoying it ASAP! 🎉 Could you please share your order number or the email you used to place the order? Let’s make this delivery magic happen! 🚀✨'}],
 'non_preferred_output': [{'role': 'assistant',
   'content': 'Provide your order number for delivery status.'}]}

To assess the model's performance prior to fine-tuning, we'll use an automated grader (LLM-as-a-Judge) to score each response for friendliness and empathy. The grader will assign a score from 0 to 4 for each answer, allowing us to compute a mean baseline score for the base model.

To do this, we first generate responses for the base model on the test set, then use the OpenAI evals API to create and run an evaluation with an automated grader.

async def generate_responses(
    testset,
    model,
    temperature=0.0,
    max_output_tokens=80,
    concurrency=8,
):
    """
    Generate responses for each prompt in the testset using the OpenAI responses API.
    Returns: List of dicts: [{"prompt": ..., "response": ...}, ...]
    """
    async_client = AsyncOpenAI()
    sem = asyncio.Semaphore(concurrency)

    async def get_response(prompt):
        async with sem:
            resp = await async_client.responses.create(
                model=model,
                input=[
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": prompt},
                ],
                temperature=temperature,
                max_output_tokens=max_output_tokens,
            )
            return {"prompt": prompt, "response": resp.output_text}

    tasks = [get_response(item["item"]["input"]) for item in testset]
    results = await asyncio.gather(*tasks)
    return results


# generate responses for the base model over the test set
base_model = "gpt-4.1-mini-2025-04-14"
testset = [
    {"item": {"input": pair["input"]["messages"][1]["content"]}} for pair in test_pairs
]
responses = await generate_responses(testset, model=base_model)

Next, we'll use the OpenAI evals API to create & run an evaluation with an automated grader, starting by defining the rubric for the LLM-as-a-Judge. Note: we will access responses via data logging, so for this to work you'll need to be in an org where data logging isn't disabled (through ZDR, etc.). If you aren't sure whether this is the case for you, go to https://platform.openai.com/logs?api=responses and check whether you can see the responses you just generated.

JUDGE_SYSTEM = """
You judge whether a reply matches Good Vibes Corp's desired tone:
energetic, super-friendly, enthusiastic.

Score 0-4 (higher = more energy):

4 - Highly enthusiastic: multiple upbeat phrases / emojis / exclamations, clear empathy, proactive help.
3 - Energetic & friendly: visible enthusiasm cue (≥1 emoji OR exclamation OR upbeat phrase), warm second-person tone.
2 - Pleasant: polite & positive but lacks obvious enthusiasm cues.
1 - Neutral: correct, businesslike, minimal warmth.
0 - Rude, negative, or unhelpful.
"""
from openai import OpenAI

sync_client = OpenAI()

# set judge model
judge_model = "gpt-4.1-2025-04-14"

# create the evaluation
logs_eval = sync_client.evals.create(
    name="Good Vibes Corp Tone Eval",
    data_source_config={
        "type": "logs",
    },
    testing_criteria=[
        {
            "type": "score_model",
            "name": "General Evaluator",
            "model": judge_model,
            "input": [
                {
                    "role": "system",
                    "content": JUDGE_SYSTEM,
                },
                {
                    "role": "user",
                    "content": (
                        "**User input**\n"
                        "{{item.input}}\n"
                        "**Response to evaluate**\n"
                        "{{sample.output_text}}"
                    ),
                },
            ],
            "range": [0, 4],
            "pass_threshold": 2,
        }
    ],
)
# run the evaluation
base_run = sync_client.evals.runs.create(
    name=base_model,
    eval_id=logs_eval.id,
    data_source={
        "type": "responses",
        "source": {"type": "responses", "limit": len(test_pairs)},
    },
)
# score base model
base_data = sync_client.evals.runs.output_items.list(
    eval_id=logs_eval.id, run_id=base_run.id
).data
base_scores = [s.results[0]["score"] for s in base_data]
print("Average score:", sum(base_scores) / len(base_scores))

With a baseline established, we can now fine-tune the model using the training set and DPO. This process will teach the model to prefer responses that align with our desired style, based on the preference pairs we created earlier.

Note: beta (β) is a unique fine-tuning hyperparameter for Direct Preference Optimization (DPO). It’s a floating-point number ranging between 0 and 2, controlling the balance between preserving a model’s existing behavior and adapting to new, preference-aligned responses.

  • High β (close to 2): makes the model more conservative, strongly favoring previous behavior. The fine-tuned model will show minimal deviations from its original style or characteristics, emphasizing consistency and avoiding abrupt changes.
  • Moderate β (around 1): balances between adherence to prior behavior and adaptation to new preferences. Recommended as a sensible starting point for most practical scenarios.
  • Low β (close to 0): encourages aggressive adaptation, causing the model to prioritize newly provided preferences more prominently. This might result in significant stylistic shifts and greater alignment with explicit preferences but could lead to unexpected or overly specialized outputs.

Technically, beta scales the difference in log-probabilities in the DPO loss; a larger β causes the sigmoid-based loss function to saturate with smaller probability differences, yielding smaller weight updates (thus preserving old behavior). It is recommended to experiment systematically with the β value to achieve optimal results tailored to your specific use-case and desired trade-offs between stability and adaptation.
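For reference, the role of β is explicit in the standard DPO objective. The following is a sketch of the textbook formulation rather than anything specific to the OpenAI implementation, where π_θ is the model being tuned, π_ref is the frozen reference model, and y_w, y_l are the preferred and non-preferred completions for prompt x:

\mathcal{L}_{\mathrm{DPO}}(\theta)
  = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
    \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]

Because β multiplies the log-probability ratios inside the sigmoid, a larger β saturates the loss for a smaller gap between the preferred and non-preferred completions, which is the "smaller weight updates, preserve existing behavior" effect described above.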

import io
import json

# create training file
train_buf = io.BytesIO("\n".join(json.dumps(p) for p in train_pairs).encode())
train_buf.name = "train.jsonl"
train_file_id = sync_client.files.create(file=train_buf, purpose="fine-tune").id

# create validation file
val_buf = io.BytesIO("\n".join(json.dumps(p) for p in val_pairs).encode())
val_buf.name = "val.jsonl"
val_file_id = sync_client.files.create(file=val_buf, purpose="fine-tune").id

# create a fine-tuning job
ft = sync_client.fine_tuning.jobs.create(
    model=base_model,
    training_file=train_file_id,
    validation_file=val_file_id,
    method={
        "type": "dpo",
        "dpo": {
            "hyperparameters": {
                "n_epochs": 2,
                "beta": 0.1,
                "batch_size": 8,
            }
        },
    },
)
print(f"Fine-tuning job created: job_id = {ft.id}")
Fine-tuning job created: job_id = ftjob-5QPmA36QezFRGoXjuvIAPuAQ

Once fine-tuning is complete, we'll evaluate the DPO-tuned model on the same test set. By comparing the mean scores before and after fine-tuning, as well as reviewing example outputs, we can see how the model's alignment with our preferences has improved.

# generate responses
job = sync_client.fine_tuning.jobs.retrieve(ft.id)
if job.status == "succeeded":
    responses = await generate_responses(testset, model=job.fine_tuned_model)

    post_run = sync_client.evals.runs.create(
        name=ft.id,
        eval_id=logs_eval.id,
        data_source={
            "type": "responses",
            "source": {"type": "responses", "limit": len(test_pairs)},
        },
    )
# get scores from the evaluation
post_data = sync_client.evals.runs.output_items.list(
    eval_id=logs_eval.id, run_id=post_run.id
).data
post_scores = [s.results[0]["score"] for s in post_data]

# print scores & a sample comparison from the test set for illustration
print(
    "Δ mean:",
    sum(t - b for b, t in zip(base_scores, post_scores)) / len(base_scores),
)
print("\n=== SAMPLE COMPARISON ===")
idx = 0
print(f"Prompt:\n  {testset[idx]['item']['input']}\n")
print(f"Base model reply: \n {base_data[idx].sample.output[0].content} \n")
print(f"DPO-tuned model reply \n {post_data[idx].sample.output[0].content}")
Δ mean: 0.45

=== SAMPLE COMPARISON ===
Prompt:
  Can I upgrade to faster delivery if my package is still being processed?

Base model reply: 
 Whether you can upgrade to express shipping while your order is still being processed depends on the store's policies. Generally, many stores allow shipping upgrades before the order is shipped. 

To assist you better, could you please provide your order number or the name of the store you ordered from? Alternatively, you can contact the store's customer service directly to request the upgrade. 

DPO-tuned model reply 
 Hi! I’d be happy to help with that. If your package hasn’t shipped yet, there’s a good chance we can upgrade your delivery speed. Could you please provide me with your order number? I’ll check the status and let you know the available options for faster delivery.

Docker Engine v29: Foundational Updates for the Future


This post is for Linux users running Docker Engine (Community Edition) directly on their hosts. Docker Desktop users don’t need to take any action — Engine updates are included automatically in future Desktop releases.

Docker Engine v29 is a foundational release that sets the stage for the future of the Docker platform. While it may not come with flashy new features, it introduces three significant under-the-hood changes that simplify our architecture and improve ecosystem alignment:

  • The containerd image store is now the default for new installations.
  • Migration to Go modules.
  • Experimental support for nftables.

These changes improve maintainability, developer experience, and interoperability across the container ecosystem.

Containerd Image Store Becomes the Default

Why We Made This Change

The Containerd runtime originated as a core component of Docker Engine and was later split out and donated to the Cloud Native Computing Foundation (CNCF). It now serves as the industry-standard container runtime, powering Kubernetes and many other platforms.

While Docker introduced containerd for container execution years ago, we continued using the graph driver storage backend for managing image layers. Meanwhile, containerd evolved its own image content store and snapshotter framework, designed for modularity, performance, and ecosystem alignment.

To ensure stability, Docker has been gradually migrating to the containerd image store over time. Docker Desktop has already used the containerd image store as the default for most of the past year. With Docker Engine v29, this migration takes the next step by becoming the default in the Moby engine.

What it is

  • As of Docker Engine v29, the containerd image store becomes the default for image layer and content management for new installs.
  • Legacy graph drivers are still available, but are now deprecated. New installs can still opt out of the containerd image store if they run into any issues.

Why This Matters

  • Simplified architecture: Both execution and storage now use containerd, reducing duplication and internal complexity
  • Unlock new feature possibilities, such as:
    • Snapshotter innovations
    • Lazy pulling of image content
    • Remote content stores
    • Peer-to-peer distribution
  • Ecosystem alignment: Brings Docker Engine in sync with containerd-based platforms, like Kubernetes, improving interoperability.
  • Future-proofing: Enables faster innovation in image layer handling and runtime behaviour

We appreciate that this change may cause some disruption, as the Containerd image store takes a different approach to content and layer management compared to the existing storage drivers.

However, this shift is a positive one. It enables a more consistent, modular, and predictable container experience.

Migration Path

To be clear, these changes only impact new installs; existing users will not be forced onto the containerd image store. However, you can start your migration now and opt in.

We are working on a migration guide to help teams transition and move their existing content to the containerd image store.

What’s next

  • The graph driver backend will be removed in a future release.
  • Docker will continue evolving the image store experience, leveraging the full capabilities of containerd’s ecosystem.
  • Expect to see enhanced content management, multi-snapshotter support, and faster pull/push workflows in the future.

Moby Migrates to Go Modules

Why We Made This Change

Go modules have been the community standard since 2019, but until now, the Moby project used a legacy vendoring system. Avoiding Go modules was creating:

  • Constant maintenance churn to work around tooling assumptions
  • Confusing workflows for contributors
  • Compatibility issues with newer Go tools and ecosystem practices

Simply put, continuing to resist Go modules was making life harder for everyone.

What It Is

  • The Moby codebase is now fully module-aware using go.mod.
  • This means cleaner dependency management and better interoperability for tools and contributors.
  • External clients, API libraries, and SDKs will find the Moby codebase easier to consume and integrate with.

What It’s Not

  • This is not a user-facing feature—you won’t see a UI or command change.
  • However, it does affect developers who consume Docker’s Go APIs.

Important for Go Developers

If you’re consuming the Docker client or API packages in your own Go projects:

  • The old module path github.com/docker/docker will no longer receive updates.
  • To stay current with Docker Engine releases, you must switch to importing from github.com/moby/moby.

Experimental support for nftables

Why We Made This Change

For bridge and overlay networks on Linux, Docker Engine currently creates firewall rules using “iptables” and “ip6tables”.

In most cases, these commands are linked to “iptables-nft” and “ip6tables-nft”. So, Docker’s rules are translated to nftables behind the scenes.

However, OS distributions are beginning to deprecate support for iptables. It’s past time for Docker Engine to create its own nftables rules directly.

What It Is

Opt-in support for creating nftables rules instead of iptables.

The rules are functionally equivalent, but there are some differences to be aware of, particularly if you make use of the “DOCKER-USER” chain in iptables.

On a host that uses “firewalld”, iptables rules are created via firewalld’s deprecated “direct” interface. That’s not necessary for nftables because rules are organised into separate tables, each with its own base chains. Docker will still set up firewalld zones and policies for its devices, but it creates nftables rules directly, just as it does on a host without firewalld.

What It’s Not

In this initial version, nftables support is “experimental”. Please be cautious about deploying it in a production environment.

Swarm support is planned for a future release. At present, it’s not possible to enable Docker Engine’s nftables support on a node with Swarm enabled.

In a future release, nftables will become the default firewall backend and iptables support will be deprecated.

Future Work

In addition to adding planned Swarm support, there’s scope for efficiency improvements.

For example, the rules themselves could make more use of nftables features, particularly sets of ports.

These changes will be prioritised based on the feedback received. If you would like to contribute, do let us know!

Try It Out

  • Start “dockerd” with the option “--firewall-backend=nftables” to enable nftables support.
  • After a reboot, you may find you need to enable IP forwarding on the host. If you’re using the “DOCKER-USER” iptables chain, it will need to be migrated. For more information, see https://docs.docker.com/engine/network/firewall-nftables
  • We’re looking for feedback. If you find issues, let us know at https://github.com/moby/moby/issues.

Getting Started with Engine v29

As mentioned, this post is for Linux users running Docker Engine (Community Edition) directly on their hosts. Docker Desktop users don’t need to take any action — Engine updates are included automatically in the upcoming Desktop releases.

To install Docker Engine on your host or update an existing installation, please follow the guide for your specific OS.

For additional information about this release:


Bridging the AI divide: How Frontier firms are transforming business


Across every industry, leaders are asking: How can AI be used to fundamentally transform our business? At the forefront are Frontier firms — empowering human ambition and finding AI-first differentiation in everything to maximize their potential and impact on society. These firms are redefining what’s possible and setting the pace for the future.

To better understand this transformation, Microsoft commissioned a global study with the International Data Corporation (IDC) of more than 4,000 business leaders responsible for AI decisions. The findings reveal that 68% of these companies are using AI today, but the real difference lies in how they’re using it. Frontier firms, the ones leading in AI Transformation, report they are achieving returns that are three times higher than slow adopters.

What sets Frontier firms apart

Their success goes beyond efficiency and productivity at scale, driving growth, expansion and industry leadership in a new AI-powered economy. Based on the IDC study, Microsoft has identified five key lessons learned in becoming a Frontier firm and how organizations can transform their business with AI.

#1: EXPANDING AI IMPACT ACROSS EVERY BUSINESS FUNCTION

On average Frontier firms are using AI across seven business functions. Over 70% are using AI in customer service, marketing, IT, product development and cybersecurity. These functions benefit from AI’s ability to automate workflows, generate content and detect anomalies in real time. This broad adoption is translating into measurable business impact: Frontier firms report better outcomes at a rate that is 4X greater than slow adopters across brand differentiation (87%), cost efficiency (86%), top-line growth (88%) and customer experience (85%).

BlackRock is transforming its investment lifecycle with Microsoft AI integrated into its Aladdin platform. Embedded across 20 apps and used by tens of thousands of users, AI tools help client relationship managers save hours per client by generating personalized briefs and opportunity analyses, while portfolio managers access real-time analytics and research summaries through Aladdin Copilot. The result is faster insights, improved data quality and enhanced risk management; helping BlackRock and its clients gain an advantage while enhancing client service, compliance and portfolio management.

#2: UNLOCKING INDUSTRY-SPECIFIC VALUE

While many organizations start their AI journey with personal productivity gains like automating tasks and improving efficiency, Frontier firms are moving further, deploying AI for strategic, industry-specific applications. According to the study, 67% are monetizing industry-specific AI use cases to boost revenue.

Industries at the forefront of this transformation include financial services, healthcare and manufacturing. Each is finding powerful, practical ways to apply AI to its most complex challenges. In financial services, organizations are strengthening fraud detection, accelerating transaction reconciliation and elevating customer support. In healthcare, it is helping clinicians generate accurate documentation, assist in diagnostics and deliver more personalized care. In manufacturing, AI is driving predictive maintenance, optimizing production schedules and automating quality inspections.

Mercedes-Benz is scaling AI across its global production network to advance automotive innovation, stabilize supply chain volatility, simplify production complexity and meet sustainability demands. Its MO360 data platform connects more than 30 car plants worldwide to the Microsoft Cloud for real-time data access, global optimization and analytics. The Digital Factory Chatbot Ecosystem uses a multi-agent system to empower employees with collaborative insights. Paint Shop AI leverages machine learning simulations to diagnose efficiency declines and reduce energy consumption of the buildings and machines — including 20% energy savings in the Rastatt paint shop — and NVIDIA Omniverse on Azure powers digital twins for agile planning and continuous improvement.

#3: BUILDING CUSTOM AI SOLUTIONS FOR COMPETITIVE ADVANTAGE

Today, 58% of Frontier firms are using custom AI solutions. Custom AI solutions allow businesses to embed proprietary knowledge, tone and compliance into every interaction. They can be fine-tuned on proprietary data or industry-specific knowledge, enabling higher accuracy in predictions or content generation and better alignment with business goals and compliance needs.

Within the next 24 months, 77% of Frontier firms plan to use custom AI solutions. This reflects a growing trend that AI leaders are layering in deeper strategic integrations of AI across their business.

As customers seek to use AI more to shop and search for products, luxury lifestyle company Ralph Lauren developed a personal, frictionless, inspirational and accessible solution to blend fashion with cutting-edge AI. Working with Microsoft, Ralph Lauren developed Ask Ralph: an AI-powered conversational tool providing styling tips and outfit recommendations from across the Polo Ralph Lauren brand. Powered by Azure OpenAI, the AI tool uses a natural language search engine to adapt dynamically to specific language inputs and interpret user intent to improve accuracy. It supports complex queries with exploratory or nuanced information needs with contextual understanding; and can discern tone, satisfaction and intent to refine recommendations. The tool also picks up on cues like location-based insights or event-driven needs. With Ask Ralph, customers can now reimagine how they shop online by putting the brand’s unique and iconic take on style right into their own hands.

#4: AGENTIC AI: THE NEW DIFFERENTIATOR FOR BUSINESS LEADERS

Agentic AI — systems that can reason, plan and act with human guidance — is fast becoming the next defining capability of Frontier organizations. In the next two years, IDC estimates the number of companies using agentic AI will triple.

Leaders today face a familiar challenge — teams are operating at full capacity, yet the demand for innovation and impact continues to grow. That’s where AI agents come in. In finance, they can surface real-time insights, provide policy guidance, review deal documents and assist in sourcing suppliers. In sales, agents are becoming always-on teammates — building pipelines, unifying insights across CRM systems, meetings, emails and the web and helping sellers qualify leads and draft personalized outreach. In customer service, AI agents can manage cases, maintain knowledge accuracy and interpret customer intent.

Dow is using agents to automate the shipping invoice analysis process and streamline its global supply chain to unlock new efficiencies and value. Receiving more than 100,000 shipping invoices via PDF each year, Dow built an autonomous agent in Copilot Studio to scan for billing inaccuracies and surface them in a dashboard for employee review. Using Freight Agent — a second agent built in Copilot Studio — employees can investigate further by “dialoguing with the data” in natural language. The agents are helping employees solve the challenge of hidden losses autonomously within minutes rather than weeks or months. Dow expects to save millions of dollars on shipping costs through increased accuracy in logistic rates and billing within the first year.

#5: AI BUDGETS ARE GROWING AND SO IS THE TEAM BEHIND THEM

71% of respondents plan to increase their AI budgets, with funding coming from IT and non-IT sources. These investments are no longer confined to the IT department or the Chief Digital Officer’s office.

Truly unlocking AI’s transformational potential requires everyone collaborating across functions to drive innovation, adoption and impact: 34% of respondents are adding net new investment, 24% are repurposing existing IT budgets and 13% are reallocating funds from non-IT areas such as operations, HR or marketing. This diversified funding strategy signals that AI is no longer viewed as a niche technology — it’s becoming a core enabler of enterprise-wide transformation.

“IDC projects that the global economic impact of AI will reach $22.3 trillion by 2030 (3.7% of global GDP in 2030). Estimating the return on AI investments requires both strong measurement capabilities and a robust business case — one that models both cost implications and the potential for responsible value creation,” said David Schubmehl, Vice President, AI and Automation, at IDC.

The AI imperative: Act now to lead the future

The opportunity to demand more from AI is now. Among organizations surveyed, 22% are Frontier firms, realizing measurable impact and moving with speed, while 39% risk falling behind. Many are navigating challenges around security, privacy, governance and cost, as well as ethical considerations, integration complexity and scaling from pilot to production.

The message is clear: those who embrace AI benefit from momentum in efficiency, customer experience and innovation. To stay competitive, leaders should act now and embrace AI not as an experiment but as a strategic imperative for growth.

Closing the gap: Start your transformation today

Success starts with investment, governance and organizational readiness. Having a robust infrastructure that is secure, reliable and scalable to support AI initiatives is critical. The emergence of Frontier firms shows that customized AI deployment and responsible oversight can drive ROI and innovation.

Explore how Microsoft’s AI solutions can transform your organization. Leverage our resources to innovate with AI and start your journey to becoming a Frontier firm.

Alysa Taylor is the Chief Marketing Officer for Commercial Cloud and AI at Microsoft, leading teams that enable digital and AI transformation for organizations of all sizes across the globe. She is at the forefront of helping organizations around the world harness digital and AI innovation to transform how they operate and grow.

 

NOTE

IDC InfoBrief: sponsored by Microsoft, What Every Company Can Learn From Frontier firms Leading the AI Revolution, IDC # US53838325, November 2025

The post Bridging the AI divide: How Frontier firms are transforming business appeared first on The Official Microsoft Blog.


Behind the scenes on how Windows 95 application compatibility patched broken programs


Whenever possible, Windows 95 made application compatibility tweaks through things like compatibility flags that alter the behavior of the system for any program the flag was applied to. Using compatibility flags allows the fix to be generalized: If one program has a problem, there’s a good chance that another program will also have that same problem. So you can apply the flag to the second program and take advantage of the same fix.

On very rare occasions, the problem is too deeply embedded in the program, and the only reasonable option is to patch it. Out of safety, the Windows 95 team got written permission from the vendor whenever they needed to patch a program. The consultation included detailed information on what the problem was and how it was going to be patched. In exchange, the team requested information from the vendor on what versions of their product are affected (and if they could send those versions for analysis), as well as a promise to fix the problem in their next version, because the next version won’t have the benefit of the patch.

The patches themselves were kept in the registry under HKLM\System\CurrentControlSet\Control\SessionManager\AppPatches\〈ModuleName〉\〈Detection string〉. When a 16-bit module is loaded and its target Windows version is less than 4.0, the kernel looks through all the detection strings and tries them one by one to see if any of them triggers.

The detection string is decoded into bytes. The first byte represents the match algorithm, and the rest are parameters to the algorithm.

Type 01: Matching the NE header using 8-bit offsets. The value of is the 8-bit offset into the header, and the xx values are the bytes to match.

01 nn of xx ... xx 00
   (the xx block is nn bytes; 00 is the terminator; repeat as needed)

Type 02: Matching the NE header using 16-bit offsets. The value offs is the 16-bit offset (little-endian) into the header, and the xx values are the bytes to match.

02 nn offs xx ... xx 00
   (the xx block is nn bytes; 00 is the terminator; repeat as needed)

Type 03: Matching the file contents using 16-bit offsets. The value offs is the 16-bit offset (little-endian) into the file, and the xx values are the bytes to match.

03 nn offs xx ... xx 00
   (the xx block is nn bytes; 00 is the terminator; repeat as needed)

Type 04: Matching the file contents using 24-bit offsets. The value offset is the 24-bit offset (little-endian) into the file, and the xx values are the bytes to match.

04 nn offset xx ... xx 00
   (the xx block is nn bytes; 00 is the terminator; repeat as needed)

Type 05: Matching the file contents using 32-bit offsets. The value offset32 is the 32-bit offset (little-endian) into the file, and the xx values are the bytes to match.

05 nn offset32 xx ... xx 00
   (the xx block is nn bytes; 00 is the terminator; repeat as needed)

Type 06: Matching the 16-bit file size. The value size is the 16-bit file size (little-endian).

06 size

Type 07: Matching the 24-bit file size. The value size24 is the 24-bit file size (little-endian).

07 size24

Type 08: Matching the 32-bit file size. The value filesize is the 32-bit file size (little-endian).

08 filesize

Finally, there is the “combo” detector. This allows you to combine multiple detectors (which must all be satisfied).

Type FF: Combo detector. Each xx block is one of the other detector types (starting with the type byte and ending with the null terminator).

FF nn xx ... xx 00
   (each sub-detector block is nn bytes; 00 is the terminator; repeat as needed)

For example, one of the detectors that comes with Windows 95 is ff0601023e0a03000306f05c00. This breaks down as

FF Combo detector
06 First detector is 6 bytes long
01 Type 1: Match NE header with 1-byte offsets
02 Match two bytes
3e At offset 0x3E in the NE header (expected Windows version)
0a 03 Bytes 0a, 03, indicating Windows version 3.1
00 Terminator
03 Second detector is 3 bytes long
06 Type 6: Match 16-bit file size
f0 5c File size 0x5cf0
00 No more detectors

In practice, you tend to see a lot of file size matches, because any change to a program is highly likely to alter the file size. Conversely, you are unlikely to see many file contents matches because those incur additional I/O and are therefore more expensive.
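As an illustration, a small decoder for these detector strings might look like the following. This is a reader's sketch based solely on the layouts above, not the actual Windows 95 code; running it on the example string reproduces the breakdown.

def decode_detector(hex_string: str) -> list:
    """Illustrative parser for the detector encoding described above (not the real implementation)."""
    data = bytes.fromhex(hex_string)
    pos, checks = 0, []

    def read_match_groups(buf: bytes, offset_size: int):
        """Parse repeated 'nn offset xx...xx' groups followed by a 00 terminator."""
        i, groups = 0, []
        while i < len(buf) and buf[i] != 0x00:
            nn = buf[i]
            offset = int.from_bytes(buf[i + 1 : i + 1 + offset_size], "little")
            match = buf[i + 1 + offset_size : i + 1 + offset_size + nn]
            groups.append((hex(offset), match.hex(" ")))
            i += 1 + offset_size + nn
        return groups, i + 1  # +1 to skip the terminator

    while pos < len(data):
        kind = data[pos]
        if kind == 0xFF:                              # combo: length-prefixed sub-detectors, 00-terminated
            pos += 1
            while data[pos] != 0x00:
                sub_len = data[pos]
                checks += decode_detector(data[pos + 1 : pos + 1 + sub_len].hex())
                pos += 1 + sub_len
            pos += 1
        elif kind in (0x01, 0x02):                    # NE header match, 8-/16-bit offsets
            groups, used = read_match_groups(data[pos + 1 :], offset_size=kind)
            checks.append(("NE header match", groups))
            pos += 1 + used
        elif kind in (0x03, 0x04, 0x05):              # file contents match, 16-/24-/32-bit offsets
            groups, used = read_match_groups(data[pos + 1 :], offset_size=kind - 1)
            checks.append(("file contents match", groups))
            pos += 1 + used
        elif kind in (0x06, 0x07, 0x08):              # file size match, 16-/24-/32-bit size
            size_len = kind - 4
            checks.append(("file size", hex(int.from_bytes(data[pos + 1 : pos + 1 + size_len], "little"))))
            pos += 1 + size_len
        else:
            raise ValueError(f"unknown detector type {kind:#04x}")
    return checks


print(decode_detector("ff0601023e0a03000306f05c00"))
# -> [('NE header match', [('0x3e', '0a 03')]), ('file size', '0x5cf0')]
#    i.e. match bytes 0a 03 at NE-header offset 0x3E, and a file size of 0x5CF0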

If a match is found, the subkeys indicate the segments to patch, and the values of those subkeys are binary data providing the patch to apply. The names of the values are not significant, but traditionally “Add” patches are named Add and “Change” patches are named Change. If there is more than one Add or Change patch, tradition dictates that they are given numeric suffixes to distinguish them.

Type 01: Change bytes. The sz is the total size of the patch value. The offs is a 16-bit (little-endian) offset into the segment. The xx values are the bytes expected to be there, and the yy values are the bytes that they will be changed to.

01 sz offs nn xx ... xx yy ... yy
   (the xx block and the yy block are each nn bytes)

Type 02: Append bytes. The sz is the total size of the patch value. The offs is a 16-bit offset (little-endian) to where the bytes should be added. (The offset must be greater than or equal to the actual segment size, and the segment will be grown to accommodate the extra bytes.) The xx values are the bytes to add.

02 sz offs nn xx ... xx
   (the xx block is nn bytes)

For example, the detector above comes with this patch: 0109700002ff76eb15. This breaks down as follows:

01 Change bytes
09 Total size of this entry is 9 bytes
7000 Segment offset is 0x0070
02 Change two bytes
ff 76 Original bytes are ff 76
eb 15 Change them to eb 15
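A matching sketch for the patch records themselves, again assuming only the layout described above, decodes that example value:

def decode_patch(hex_string: str) -> dict:
    """Illustrative parser for a patch value ('Change bytes' / 'Append bytes'), per the layout above."""
    data = bytes.fromhex(hex_string)
    kind, declared_size = data[0], data[1]      # second byte is the total size of the patch value
    offset = int.from_bytes(data[2:4], "little")
    nn = data[4]
    if kind == 0x01:                            # Change bytes: expected bytes, then their replacements
        return {
            "action": "change",
            "declared_size": declared_size,
            "segment_offset": hex(offset),
            "expected": data[5 : 5 + nn].hex(" "),
            "replacement": data[5 + nn : 5 + 2 * nn].hex(" "),
        }
    if kind == 0x02:                            # Append bytes: bytes added at/after the end of the segment
        return {
            "action": "append",
            "declared_size": declared_size,
            "segment_offset": hex(offset),
            "bytes": data[5 : 5 + nn].hex(" "),
        }
    raise ValueError(f"unknown patch type {kind:#04x}")


print(decode_patch("0109700002ff76eb15"))
# -> {'action': 'change', 'declared_size': 9, 'segment_offset': '0x70',
#     'expected': 'ff 76', 'replacement': 'eb 15'}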

I chose this example because it’s one of the patches I wrote. It fixes a bug in a sound card driver that corrupts the upper 16 bits of extended 32-bit registers in a hardware interrupt handler. The corruption happens in a debug logging function, so the patch replaces a section of the logging function with eb 15, which is the encoding of an unconditional jump forward by 0x15 bytes. This skips over the section that corrupts the registers and resumes execution at a harmless point later in that function.

The corruption happens because the driver calls a function which corrupts the upper 16 bits of extended 32-bit registers, as is permitted by normal 16-bit code. However, hardware interrupt handlers operate under stricter conditions than normal 16-bit code, and the function in question is not documented as safe to call from a hardware interrupt. This code was always broken, but they mostly got away with it prior to Windows 95.

Before this change, the driver corrupted registers during a hardware interrupt, resulting in unpredictable behavior in the code that was interrupted. (See also: Space aliens.)

Now, with this change, the logging never executes either, but the only place the message gets logged to is the debug terminal, so the only people who see these messages are developers. If the sound card vendor wants to see these messages on their debug terminal, they can fix their bug.

The post Behind the scenes on how Windows 95 application compatibility patched broken programs appeared first on The Old New Thing.


.NET 10 - Release


We are pleased to announce the general availability of .NET 10.

Read about the final release on the .NET Blog announcement.

You can find a detailed overview of new features across .NET 10 in our documentation:

Get started

To get started with .NET 10, install the .NET 10 SDK.

If you're on Windows using Visual Studio, we recommend installing the latest Visual Studio 2022. .NET 10 can be installed directly through the Visual Studio installer starting with Visual Studio 2022 17.13.

You can also use Visual Studio Code and the C# Dev Kit extension with .NET 10.

You can find .NET 10 Installer & Binaries on GitHub.

Release Notes

Release notes are available for changes between .NET 10 RC2 and this release:

Celebrate the release of .NET 10 at .NET Conf 2025 - November 11th - 13th.


Towards Humanist Superintelligence
