Content Developer II at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

What's new with AI on Windows 11 version 24H2 (2024 Update) | Windows Central


If you're not into AI, you probably won't like this guide.


Scratch that! We’re actually no wiser about when Microsoft plans to release the Windows 11 24H2 update

For those keenly awaiting the release of the Windows 11 24H2 update, a recent Microsoft blog post caused a good deal of excitement. It appeared to reveal that this significant feature update is due to roll out this very month; but all was not as it seemed. Microsoft has now updated the blog post to clarify that the information it includes has been misinterpreted -- or perhaps that it was not sufficiently clearly written in the first place. Where does this leave us? It is easy to see why the article generated such interest. The blog… [Continue Reading]

Go further, faster with the Swift Career Accelerator


Unleash your full potential as a Swift developer with the all-new Swift Career Accelerator: the most comprehensive, career-transforming learning resource ever created for iOS development.

Whether you’re just starting out, looking to land your first job, or aiming to become a lead developer, this program offers everything you need to level up – from mastering Swift’s latest features to conquering interview questions and building robust portfolios.

So, if you're ready for a guided journey that will elevate your skills and accelerate your career, read on…

Your complete pathway

The Swift Career Accelerator aims to give you a complete pathway to take your career forward, no matter what your level is right now.

It does this by bringing together, for the first time, the world's largest collection of Swift development resources in one place, then carefully organizing it across five career stages so that everyone at every level has something to take them forward…

Level 1: Kickstart your career and land your first job

If you’re just starting out, Level 1 is designed to help you build the foundation for a successful career in Swift development. Along with my popular 100 Days of SwiftUI and Ultimate Portfolio App courses, you’ll gain exclusive access to my brand-new Take Home Test course. You’ll master essential skills like Git source control and dive into critical data structures like queues, stacks, and trees.

But that’s just the beginning. You’ll also get a wealth of new interview questions with expert answers, step-by-step guidance on crafting a standout resume, and practical tips for launching your first apps on the App Store. Everything you need to land that first job and start building your future is right here.

Level 2: Strengthen your skills and advance as a developer

Once you've found your first job, it's time to build on that foundation and take your skills to the next level. Here you'll move on to more detailed topics such as gen…




Optimizing Models: Fine-Tuning, RAG and Application Strategies


Before diving in, let's take a moment to review the key resources and foundational concepts that will guide us through this blog and ensure we're well equipped to follow along. This brief review will provide a strong starting point for exploring the main topics ahead.

  • Microsoft Azure: a cloud computing platform and suite of cloud services that enables organizations to build, deploy, and manage applications and services through Microsoft's global network of data centers.
  • AI Studio: a platform that helps you evaluate model responses and orchestrate prompt application components with prompt flow for better performance. The platform makes it easy to scale proofs of concept into full-fledged production, while continuous monitoring and refinement support long-term success.


  • Fine-tuning: the process of retraining pretrained models on specific datasets. The purpose is typically to improve model performance on specific tasks or to introduce information that wasn't well represented when the base model was originally trained.
  • Retrieval Augmented Generation (RAG): a pattern that works with pretrained large language models (LLMs) and your own data to generate responses. In Azure Machine Learning, you can implement RAG in a prompt flow.

Our hands-on learning will be developing an AI-based solution that helps the user extract financial information and insights from the investment and finance books and newspapers in our database.

The process is divided into three main parts:

  • Fine-tune a base model with financial data so that the model provides more specific responses, grounded in data related to finance and investment.
  • Implement RAG so that responses are based not only on the data the model was fine-tuned with, but also on other data sources (the user's input, in our case).
  • Integrate the deployed model into a web app so that it can be used through a user interface.

 

1- Setup:

  1. Create a resource group, which is defined as a container that holds related resources for an Azure solution. The resource group can include all the resources for the solution, or only those resources that you want to manage as a group.
     You need to specify your subscription, a unique resource group name, and the region.
  2. Create an Azure OpenAI resource: Azure OpenAI Service provides REST API access to OpenAI's powerful language models, including GPT-4o, GPT-4 Turbo with Vision, GPT-4, GPT-3.5-Turbo, and the Embeddings model series. These models can be easily adapted to your specific task, including but not limited to content generation, summarization, image understanding, semantic search, and natural language to code translation.
     Note: If you plan to deploy or fine-tune a specific model, please check the model's availability and create your Azure OpenAI resource in that region.
     - Create a text embedding model: an embedding is an information-dense representation of the semantic meaning of a piece of text. Each embedding is a vector of floating-point numbers, such that the distance between two embeddings in the vector space is correlated with the semantic similarity between the two inputs in the original format.
  3. Create an AI Search resource: Azure AI Search (previously "Azure Cognitive Search") provides secure information retrieval at scale over user-owned content in traditional and generative AI search applications. Information retrieval is foundational to any app that surfaces text and vectors. Common scenarios include data exploration and, increasingly, feeding query results to prompts based on your proprietary grounding data for conversational search, as we will do in our example.
  4. Create a storage account: it contains all your Azure Storage data objects: blobs, files, queues, and tables. The storage account provides a unique namespace for your Azure Storage data that is accessible from anywhere in the world over HTTP or HTTPS.
     Note: Locally redundant storage (LRS) replicates your storage account three times within a single data center in the primary region. LRS provides at least 99.999999999% (11 nines) durability of objects over a given year. LRS is the lowest-cost redundancy option and offers the least durability compared to other options. For an Azure for Students subscription, for example, this choice is the most cost-effective.
     - Create a blob container: Blob Storage is Microsoft's object storage solution, optimized for storing massive amounts of unstructured data. Unstructured data is data that doesn't adhere to a particular data model or definition, such as text or binary data. The container will be used to store your data.
     Navigate to your storage resource -> click the Storage browser tab on the left -> click Blob Containers -> click + Add container, then upload your data. Our data was PDF files (books and newspapers) and CSV files from Kaggle, all related to finance and investment.
  5. Create a search index: the index is your searchable content, available to the search engine for indexing, full text search, vector search, hybrid search, and filtered queries. Check that the status of your AI Search service is "Running".
     - Import and vectorize data: integrated vectorization is an extension of the indexing and query pipelines in Azure AI Search. It adds data chunking (splitting the data into smaller, manageable pieces) during indexing, and text-to-vector conversion during indexing.
     Navigate to your AI Search service -> click the Indexes tab on the left -> click "Import and vectorize data" -> select the text embedding model you deployed earlier. A quick Python sanity check of the new index follows below.
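Once the indexer finishes, it is worth sanity-checking the index from Python before wiring it into anything else. Here is a minimal sketch, assuming the azure-search-documents package; the endpoint and key are placeholders, and the "chunk" field name is a common default from integrated vectorization, so verify the actual field names in your index schema:

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Placeholder values; use your AI Search service endpoint, key, and index name.
search_client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="myindex",
    credential=AzureKeyCredential("<your-search-key>"),
)

# Run a simple keyword query and print the top matches.
results = search_client.search(search_text="dollar cost averaging", top=3)
for result in results:
    print(result["chunk"])  # "chunk" is a typical field name; check your schema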

 

 

2- Fine Tune a Base Model

  • Preparing the dataset: before starting fine-tuning, we need to prepare our training and validation data, which could be collected manually or generated from text files or tabular data. However, it is important that the data conform to the following template:

 

 

{"messages": [{"role": "system", "content": "content goes here"}, {"role": "user", "content": "query goes here?"}, {"role": "assistant", "content": "response goes here."}]}.

 

 


To satisfy this, we prepared our two data sets training_data.jsonl and val_data.jsonl for training and validation, respectively.

Note: JSONL is a text-based format using the .jsonl file extension that is essentially the same as JSON but uses newline characters to separate JSON values; it is also known as JSON Lines. It is recommended to have at least 50 high-quality examples to start out. However, it's entirely possible that a use case might require thousands of high-quality training examples to be successful. Low-quality data will affect the model's performance.
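To make the template concrete, here is a minimal sketch of generating such a file from plain Python dictionaries; the example Q&A pairs are placeholders for illustration, not the dataset attached to this blog:

import json

# Placeholder examples; replace with your own finance Q&A pairs.
examples = [
    ("What is dollar cost averaging?",
     "Investing a fixed amount at regular intervals regardless of price."),
    ("What is diversification?",
     "Spreading investments across assets to reduce overall risk."),
]

system_msg = "You are a helper in finance and an advisor in investment"

with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for question, answer in examples:
        record = {"messages": [
            {"role": "system", "content": system_msg},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]}
        f.write(json.dumps(record) + "\n")  # one JSON object per line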

Below is an example of our training_data.jsonl:

 

 

 

{"messages": [{"role": "system", "content": "You are a helper in finance and an advisor in investment"}, {"role": "user", "content": "What makes a successful startup?"}, {"role": "assistant", "content": "Luck, hard work, and maybe skills... mostly luck, hard work, and consistency."}]}

 

 

Both data files are attached to this blog; they were assembled manually from a set of examples.

  • Evaluate the data to ensure its quality, and check the number of tokens and their distribution. The following script computes token counts and prints distribution statistics for both files:
import json
import tiktoken
import numpy as np
from collections import defaultdict

encoding = tiktoken.get_encoding("cl100k_base")

def num_tokens_from_messages(messages, tokens_per_message=3, tokens_per_name=1):
    num_tokens = 0
    for message in messages:
        if not isinstance(message, dict):
            print(f"Unexpected message format: {message}")
            continue
        num_tokens += tokens_per_message
        for key, value in message.items():
            if not isinstance(value, str):
                print(f"Unexpected value type for key '{key}': {value}")
                continue
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3
    return num_tokens

def num_assistant_tokens_from_messages(messages):
    num_tokens = 0
    for message in messages:
        if not isinstance(message, dict):
            print(f"Unexpected message format: {message}")
            continue
        if message.get("role") == "assistant":
            content = message.get("content", "")
            if not isinstance(content, str):
                print(f"Unexpected content type: {content}")
                continue
            num_tokens += len(encoding.encode(content))
    return num_tokens

def print_distribution(values, name):
    if values:
        print(f"\n#### Distribution of {name}:")
        print(f"min / max: {min(values)}, {max(values)}")
        print(f"mean / median: {np.mean(values)}, {np.median(values)}")
        print(f"p5 / p95: {np.quantile(values, 0.05)}, {np.quantile(values, 0.95)}")
    else:
        print(f"No values to display for {name}")

files = [
    r'train_data.jsonl',
    r'val_data.jsonl'
]

for file in files:
    print(f"Processing file: {file}")
    try:
        with open(file, 'r', encoding='utf-8') as f:
            total_tokens = []
            assistant_tokens = []
            for line in f:
                try:
                    ex = json.loads(line)
                    messages = ex.get("messages", [])
                    if not isinstance(messages, list):
                        raise ValueError("The 'messages' field should be a list.")
                    total_tokens.append(num_tokens_from_messages(messages))
                    assistant_tokens.append(num_assistant_tokens_from_messages(messages))
                except json.JSONDecodeError:
                    print(f"Error decoding JSON line: {line}")
                except ValueError as ve:
                    print(f"ValueError: {ve} - line: {line}")
                except Exception as e:
                    print(f"Unexpected error processing line: {e} - line: {line}")
        if total_tokens and assistant_tokens:
            print_distribution(total_tokens, "total tokens")
            print_distribution(assistant_tokens, "assistant tokens")
        else:
            print("No valid data to process.")
        print('*' * 50)
    except FileNotFoundError:
        print(f"File not found: {file}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
  • Login to AI Studio
  • Navigate to the Fine-tuning tab
  • Check the available models for fine-tuning within your region, and please make sure you have enough quota available. Insufficient quota may make the model unavailable once you exceed the token limit, and may also result in slower responses (higher latency).

     

  • Upload your training and validation data.
    Since we have our data locally, we uploaded the files directly. If you would rather keep your data in the cloud and reference it by URL instead of the "Uploading files" option, you can use the SDK and follow this code:

 

 

from openai import AzureOpenAI

# Initialize AzureOpenAI client
client = AzureOpenAI(
    azure_endpoint=azure_oai_endpoint,
    api_key=azure_oai_key,
    api_version=version  # Ensure this API version is correct
)

training_file_name = r'path'
validation_file_name = r'path'

try:
    # Upload the training dataset file
    with open(training_file_name, "rb") as file:
        training_response = client.files.create(file=file, purpose="fine-tune")
    training_file_id = training_response.id
    print("Training file ID:", training_file_id)
except Exception as e:
    print(f"Error uploading training file: {e}")

try:
    # Upload the validation dataset file
    with open(validation_file_name, "rb") as file:
        validation_response = client.files.create(file=file, purpose="fine-tune")
    validation_file_id = validation_response.id
    print("Validation file ID:", validation_file_id)
except Exception as e:
    print(f"Error uploading validation file: {e}")
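If you stay in the SDK rather than the Studio UI, the file IDs returned above can also be used to start the job programmatically. A hedged sketch continuing from the snippet above, using the openai v1 client's fine_tuning API; the base model name is an assumption, so substitute whichever model you verified as fine-tunable in your region:

# Create a fine-tuning job from the uploaded files.
fine_tune_job = client.fine_tuning.jobs.create(
    training_file=training_file_id,
    validation_file=validation_file_id,
    model="gpt-35-turbo-0613",  # assumed example; pick a model fine-tunable in your region
)
print("Job ID:", fine_tune_job.id)
print("Status:", fine_tune_job.status)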

 

 

 

  • You can specify hyperparameters such as the batch size, or leave them at their default values.
  • Review the settings before submitting.

     

  • Check the status of the fine-tuning job in your dashboard as it changes from Queued to Running to Completed.
  • Once completed, your fine-tuned model is ready to be deployed. Click on 'Deploy'.
  • After successful deployment, you can go back to Azure OpenAI and find your fine-tuned model deployed alongside your previous text embedding model. A sketch of calling the deployed model from Python follows below.
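Once deployed, the fine-tuned model can be called like any other Azure OpenAI deployment. A minimal sketch with the openai v1 client; the endpoint, key, API version, and deployment name are placeholders or assumptions:

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder endpoint
    api_key="<your-key>",                                       # placeholder key
    api_version="2024-02-01",  # assumed version; use one enabled for your resource
)

response = client.chat.completions.create(
    model="finance-ft-model",  # hypothetical name of your fine-tuned deployment
    messages=[
        {"role": "system", "content": "You are a helper in finance and an advisor in investment"},
        {"role": "user", "content": "What makes a successful startup?"},
    ],
)
print(response.choices[0].message.content)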


 

3- Integration into Web App

The concept here is to rely on the model's knowledge plus the user's documents. We have two options, and both provide high precision for responses:

  • Look for the answer in the documents and, if it is not found, return a response based on the internal knowledge of the model.
  • Combine the two responses from the retriever and the model, which is the approach we opt for here.

Also, for integration, there are two routes we may follow: go through the Azure OpenAI user interface and deploy into an Azure static web app, or develop your own web app and use the Azure SDK to integrate your model.

1- Deploying into Azure static web app

  • Click on "Open in Playground" below your deployments list in Azure open AI
  • Click "Add your data"
  • Choose your Azure blob storage as data source à Choose Index name "myindex"
  • Customize the system message to "You are a financial advisor and an expert in investment. You have access to a wide variety of documents. Use your own knowledge to answer the question and verify it or supplement it using the relevant documents when possible." This system message will enable the model not only to rely on documents but also rely on its internal knowledge.
  • Complete the setup and click on "Apply changes"
    HadilBENAmor_32-1726450105994.pngHadilBENAmor_33-1726450106013.png
  • Deploy to a new web app and configure the web app name, subscription, resource group, location, and pricing plan.

2- Develop your own web app and use the Azure SDK

  • Prepare your environment:

import os
from dotenv import load_dotenv

load_dotenv()

azure_oai_endpoint = os.getenv("AZURE_OAI_FINETUNE_ENDPOINT2")
azure_oai_key = os.getenv("AZURE_OAI_FINETUNE_KEY2")
azure_oai_deployment = os.getenv("AZURE_OAI_FINETUNE_DEPLOYMENT2")
azure_search_endpoint = os.getenv("AZURE_SEARCH_ENDPOINT")
azure_search_key = os.getenv("AZURE_SEARCH_KEY")
azure_search_index = os.getenv("AZURE_SEARCH_INDEX")
  • Initialize your AzureOpenAI client

 

 

client = AzureOpenAI(
    base_url=f"{azure_oai_endpoint}/openai/deployments/{azure_oai_deployment}/extensions",
    api_key=azure_oai_key,
    api_version="2023-09-01-preview"
)

 

 

 

  • Configure your data source for Azure AI Search. This will retrieve responses from our stored files.

 

 

extension_config = dict(
    dataSources=[
        {
            "type": "AzureCognitiveSearch",
            "parameters": {
                "endpoint": azure_search_endpoint,
                "key": azure_search_key,
                "indexName": azure_search_index,
            }
        }
    ]
)
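The post stops short of showing the chat call that ties the client and the data source together. A minimal sketch, assuming the openai v1 client configured above with the extensions base_url, passing the data source via extra_body; the user question and generation parameters are placeholders:

# Send a chat request that lets the service ground the answer in the search index.
response = client.chat.completions.create(
    model=azure_oai_deployment,
    temperature=0.5,
    max_tokens=800,
    messages=[
        {"role": "system", "content": "You are a financial advisor and an expert in investment."},
        {"role": "user", "content": "How should a beginner think about diversification?"},  # placeholder question
    ],
    extra_body=extension_config,  # hands the Azure AI Search data source to the extensions endpoint
)
print(response.choices[0].message.content)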

 

 

 

Note: When you implement RAG, make sure that the chat response doesn't rely solely on it. You may get a response such as "This information is not available in your data source," which indicates that the model only searched your data and did not draw on the data it was trained with to generate a suitable response.

RAG is used to enhance a model's capabilities by adding more grounded information, not to eliminate the model’s internal knowledge.

Some issues that you may face during development:

  • Issue 1: make sure to verify the OpenAI library version. You can pin the version to openai==0.28 or upgrade it and follow the migration steps.
  • Issue 2: you may run out of quota and be asked to wait 24 hours before the next try. Make sure to always have enough quota in your subscription.

Next, you can look at how to do real-time injection so that you can personalize the responses further. Try to work out how to connect your web app, the user's input and output, the search index, and the LLM.
Keywords: LangChain, Databricks

 



The Future of AI: Fine-Tuning Llama 3.1 8B on Azure AI Serverless, why it's so easy & cost efficient


The Future of AI: LLM Distillation just got easier

Part 2 - Fine-Tuning Llama 3.1 8B on Azure AI Serverless

How Azure AI Serverless Fine-tuning, LoRA, RAFT, and the AI Python SDK are streamlining the fine-tuning of domain-specific models (🚀🔥 GitHub recipe repo).

 

By Cedric Vidal, Principal AI Advocate, Microsoft

Part of the Future of AI 🚀 series initiated by Marco Casalaina with his Exploring Multi-Agent AI Systems blog post.

 

[Image: AI-powered engine fine-tuning setup, generated using Azure OpenAI DALL-E 3]

 

In our previous blog post, we explored utilizing Llama 3.1 405B with RAFT to generate a synthetic dataset. Today, you’ll learn how to fine-tune a Llama 3.1 8B model with the dataset you generated. This post will walk you through a simplified fine-tuning process using Azure AI Fine-Tuning as a Service, highlighting its ease of use and cost efficiency. We’ll also explain what LoRA is and why combining RAFT with LoRA provides a unique advantage for efficient and affordable model customization. Finally, we’ll provide practical, step-by-step code examples to help you apply these concepts in your own projects.

> The concepts and source code mentioned in this post are fully available in the Github recipe repo.

 

Azure AI takes the complexity out of the equation. Gone are the days when setting up GPU infrastructure, configuring Python frameworks, and mastering model fine-tuning techniques were necessary hurdles. Azure Serverless Fine-Tuning allows you to bypass the hassle entirely. Simply upload your dataset, adjust a few hyperparameters, and start the fine-tuning process. This ease of use democratizes AI development, making it accessible to a wider range of users and organizations.

Why Azure AI Serverless Fine-Tuning Changes the Game

Fine-tuning a model used to be a daunting task:

  1. Skill Requirements: Proficiency in Python and machine learning frameworks like TensorFlow or PyTorch was essential.
  2. Resource Intensive: Setting up and managing GPU infrastructure required significant investment.
  3. Time-Consuming: The process was often lengthy, from setup to execution.

Azure AI Fine-Tuning as a Service eliminates these barriers by providing an intuitive platform where you can fine-tune models without worrying about the underlying infrastructure. With serverless capabilities, you simply upload your dataset, specify hyperparameters, and hit the “fine-tune” button. This streamlined process allows for quick iterations and experimentation, significantly accelerating AI development cycles.

 

[Image: Llama relaxing in a workshop, generated using Azure OpenAI DALL-E 3]

LoRA: A Game-Changer for Efficient Fine-Tuning

What is LoRA?

LoRA (Low-Rank Adaptation) is an efficient method for fine-tuning large language models. Unlike traditional fine-tuning, which updates all the model’s weights, LoRA modifies only a small fraction of the weights, captured in an adapter. This focused approach drastically reduces the time and cost needed for fine-tuning while maintaining the model’s performance.

LoRA in Action

LoRA fine-tunes models by selectively adjusting a small fraction of weights via an adapter, offering several advantages (a numeric sketch follows the list below):

  • Selective Weight Updating: Only a fraction of the weights are fine-tuned, reducing computational requirements.
  • Cost Efficiency: Lower computational demands translate to reduced operational costs.
  • Speed: Fine-tuning is faster, enabling quicker deployments and iterations.
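To make the selective-update idea concrete, here is a minimal numeric sketch in Python with NumPy; the hidden size, rank, and alpha values are illustrative assumptions, not Azure's configuration:

import numpy as np

d, r = 4096, 8                      # hidden size and LoRA rank (assumed values)
W = np.random.randn(d, d)           # frozen pretrained weight matrix
A = np.random.randn(r, d) * 0.01    # trainable low-rank factor
B = np.zeros((d, r))                # trainable, zero-initialized so W_eff starts equal to W
alpha = 16                          # scaling hyperparameter

W_eff = W + (alpha / r) * (B @ A)   # effective weight used in the forward pass

full = W.size                       # parameters updated by full-weight fine-tuning
lora = A.size + B.size              # parameters updated by LoRA
print(f"trainable fraction: {lora / full:.4%}")  # ~0.39% for d=4096, r=8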

[Image: Illustration of LoRA fine-tuning. The diagram shows a single attention block enhanced with LoRA; each attention block in the model typically incorporates its own LoRA module. SVG diagram generated using Azure OpenAI GPT-4o]

Combining RAFT and LoRA: Why It’s So Effective

We’ve seen how serverless fine-tuning on Azure AI uses LoRA, which updates only a fraction of the model’s weights and can therefore be so cheap and fast.

 

With the combination of RAFT and LoRA, the model is not taught new fundamental knowledge. Instead, it becomes an expert at understanding the domain, focusing its attention on the citations that are most useful to answer a question, but it doesn’t contain all the information about the domain. It is like a librarian (see the RAG Hack session on RAFT): a librarian doesn’t know the content of all the books perfectly, but they know which books contain the answers to a given question.

 

Another way to look at it is from the standpoint of information theory. Because LoRA only updates a fraction of the weights, there is only so much information you can store in those weights, as opposed to full-weight fine-tuning, which updates all the weights from the bottom to the top of the model.

 

LoRA might look like a limitation, but it’s actually perfect when used in combination with RAFT and RAG. You get the best of both RAG and fine-tuning: RAG provides access to a potentially infinite amount of reference documents, and RAFT with LoRA provides a model that is an expert at understanding the documents retrieved by RAG, at a fraction of the cost of full-weight fine-tuning.

Azure AI Fine-Tuning API and the Importance of Automating your AI Ops Pipeline

Azure AI empowers developers with serverless fine-tuning via an API, simplifying the integration of fine-tuning processes into automated AI operations (AI Ops) pipelines. Organizations can use the Azure AI Python SDK to further streamline this process, enabling seamless orchestration of model training workflows. This includes systematic data handling, model versioning, and deployment. Automating these processes is crucial as it ensures consistency, reduces human error, and accelerates the entire AI lifecycle—from data preparation, through model training, to deployment and monitoring. By leveraging Azure AI’s serverless fine-tuning API, along with the Python SDK, organizations can maintain an efficient, scalable, and agile AI Ops pipeline, ultimately driving faster innovation and more reliable AI systems.

Addressing Model Drift and Foundation Model Obsolescence

One critical aspect of machine learning, especially in fine-tuning, is ensuring that models generalize well to unseen data. This is the primary purpose of the evaluation phase.

 

However, as domains evolve and documents are added or updated, models will inevitably begin to drift. The rate of this drift depends on how quickly your domain changes; it could be a month, six months, a year, or even longer.

 

Therefore, it’s essential to periodically refresh your model and execute the distillation process anew to maintain its performance.

Moreover, the field of AI is dynamic, with new and improved foundational models being released frequently. To leverage these advancements, you should have a streamlined process to re-run distillation on the latest models, enabling you to measure improvements and deploy updates to your users efficiently.

Why Automating the Distillation Process is Essential

Automation in the distillation process is crucial. As new documents are added or existing ones are updated, your model’s alignment with the domain can drift over time. Setting up an automated, end-to-end distillation pipeline ensures that your model remains current and accurate. By regularly re-running the distillation, you can keep the model aligned with the evolving domain, maintaining its reliability and performance.

Practical Steps: Fine-Tuning Llama 3.1 8B with RAFT and LoRA

Now that we’ve explained the benefits, let’s walk through the practical steps using the raft-distillation-recipe repository on GitHub.

If you have not yet run the synthetic data generation phase using RAFT, I invite you to head over to the previous article of this blog series.

 

Once you have your synthetic dataset on hand, you can head over to the finetuning notebook of the distillation recipe repository.

Here are the key snippets of code illustrating how to use the Azure AI Python SDK to upload a dataset, subscribe to the Marketplace offer, and create and submit a fine-tuning job on the Azure AI serverless platform.

Uploading the training dataset

The following code checks if the training dataset already exists in the workspace and uploads it only if needed. It incorporates the hash of the dataset into the filename, facilitating easy detection of whether the file has been previously uploaded.
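The recipe computes train_hash before this snippet runs; one plausible way to derive it (an assumption for illustration, not necessarily the repository's exact code) is to hash the file contents:

import hashlib

def dataset_hash(path: str, length: int = 8) -> str:
    # Hash the file contents so the asset name changes only when the data changes.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()[:length]

train_hash = dataset_hash("dataset_ft_train.jsonl")  # hypothetical filename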

 

 

 

 

from azure.ai.ml.entities import Data

dataset_version = "1"
train_dataset_name = f"{ds_name}_train_{train_hash}"

try:
    train_data_created = workspace_ml_client.data.get(train_dataset_name, version=dataset_version)
    print(f"Dataset {train_dataset_name} already exists")
except:
    print(f"Creating dataset {train_dataset_name}")
    train_data = Data(
        path=dataset_path_ft_train,
        type=AssetTypes.URI_FILE,
        description=f"{ds_name} training dataset",
        name=train_dataset_name,
        version=dataset_version,
    )
    train_data_created = workspace_ml_client.data.create_or_update(train_data)

from azure.ai.ml.entities._inputs_outputs import Input

training_data = Input(
    type=train_data_created.type,
    path=f"azureml://locations/{workspace.location}/workspaces/{workspace._workspace_id}/data/{train_data_created.name}/versions/{train_data_created.version}"
)

 

 

 

 

Subscribing to the Marketplace offer

This step is only necessary when fine-tuning a model from a third-party vendor such as Meta or Mistral. If you’re fine-tuning a Microsoft first-party model such as Phi-3, then you can skip this step.

 

 

 

 

from azure.ai.ml.entities import MarketplaceSubscription
from azure.core.exceptions import ResourceExistsError

model_id = "/".join(foundation_model.id.split("/")[:-2])
subscription_name = model_id.split("/")[-1].replace(".", "-").replace("_", "-")

print(f"Subscribing to Marketplace model: {model_id}")

marketplace_subscription = MarketplaceSubscription(
    model_id=model_id,
    name=subscription_name,
)

try:
    marketplace_subscription = workspace_ml_client.marketplace_subscriptions.begin_create_or_update(marketplace_subscription).result()
except ResourceExistsError as ex:
    print(f"Marketplace subscription {subscription_name} already exists for model {model_id}")

 

 

 

 

Create the fine-tuning job using the model and data as inputs

 

 

 

finetuning_job = CustomModelFineTuningJob(
    task=task,
    training_data=training_data,
    validation_data=validation_data,
    hyperparameters={
        "per_device_train_batch_size": "1",
        "learning_rate": str(learning_rate),
        "num_train_epochs": "1",
        "registered_model_name": registered_model_name,
    },
    model=model_to_finetune,
    display_name=job_name,
    name=job_name,
    experiment_name=experiment_name,
    outputs={"registered_model": Output(type="mlflow_model", name=f"ft-job-finetune-registered-{short_guid}")},
)

 

 

 

Submit the fine-tuning job

The following snippet will submit the previously created fine-tuning job to the Azure AI serverless platform. If the submission is successful, the job details, including the Studio URL and the registered model name, will be printed. Any errors encountered during the submission will be displayed as well.

 

 

 

 

try: print(f"Submitting job {finetuning_job.name}") created_job = workspace_ml_client.jobs.create_or_update(finetuning_job) print(f"Successfully created job {finetuning_job.name}") print(f"Studio URL is {created_job.studio_url}") print(f"Registered model name will be {registered_model_name}") except Exception as e: print("Error creating job", e) raise e

 

 

 

 

The full runnable code is available in the previously mentioned finetuning notebook.

Join the Conversation

We invite you to join our tech community on Discord to discuss fine-tuning techniques, RAFT, LoRA, and more. Whether you’re a seasoned AI developer or just starting, our community is here to support you. Share your experiences, ask questions, and collaborate with fellow AI enthusiasts. Join us on Discord and be part of the conversation!

 


What’s next?

This concludes the second installment of our blog series on fine-tuning the Llama 3.1 8B model with RAFT and LoRA, harnessing the capabilities of Azure AI Serverless Fine-Tuning. Today, we’ve shown how these advanced technologies enable efficient and cost-effective model customization that precisely meets your domain needs.

 

By integrating RAFT and LoRA, you can transform your models into specialists that effectively navigate and interpret relevant information from extensive document repositories using RAG, all while significantly cutting down on the time and costs associated with full weight fine-tuning. This methodology accelerates the fine-tuning process and democratizes access to advanced AI capabilities.

 

With the detailed steps and code snippets provided, you now have the tools to implement serverless fine-tuning within your AI development workflow. Leveraging automation in AI Ops will help you maintain and optimize model performance over time, keeping your AI solutions competitive in an ever-changing environment.

 

Stay tuned! In two weeks, we’ll dive into the next topic: deploying our fine-tuned models.


What's new in the Chicago Manual of Style (18th edition), with Russell Harper and Mary Laur


1015. The Chicago Manual of Style is updated every seven years, and this year's update is a big one! I talked with two of the editors — Russell Harper and Mary Laur — about the major changes, how the decisions get made, and the history of the CMOS (pronounced "sea moss").

🔗 Share your familect recording in a WhatsApp chat.

🔗 Watch my LinkedIn Learning writing courses.

🔗 Subscribe to the newsletter.

🔗 Take our advertising survey

🔗 Get the edited transcript.

🔗 Get Grammar Girl books

🔗 Join Grammarpalooza. Get ad-free and bonus episodes at Apple Podcasts or Subtext. Learn more about the difference

| HOST: Mignon Fogarty

| VOICEMAIL: 833-214-GIRL (833-214-4475).

| Grammar Girl is part of the Quick and Dirty Tips podcast network.

  • Audio Engineer: Dan Feierabend
  • Director of Podcast: Brannan Goetschius
  • Advertising Operations Specialist: Morgan Christianson
  • Marketing and Publicity Assistant: Davina Tomlin
  • Digital Operations Specialist: Holly Hutchings
  • Marketing and Video: Nat Hoopes

| Theme music by Catherine Rannus.

| Grammar Girl Social Media Links: YouTube. TikTok. Facebook. Threads. Instagram. LinkedIn. Mastodon.





Download audio: https://www.podtrac.com/pts/redirect.mp3/media.blubrry.com/grammargirl/stitcher.simplecastaudio.com/e7b2fc84-d82d-4b4d-980c-6414facd80c3/episodes/e6112b6b-4399-4ddd-a90a-a48cb3ade9e0/audio/128/default.mp3?aid=rss_feed&awCollectionId=e7b2fc84-d82d-4b4d-980c-6414facd80c3&awEpisodeId=e6112b6b-4399-4ddd-a90a-a48cb3ade9e0&feed=XcH2p3Ah