
A Comprehensive Guide to Function Calling in LLMs


One of the proven techniques to reduce hallucinations in large language models is retrieval-augmented generation, or RAG. RAG uses a retriever that searches external data to augment a prompt with context before sending it to the generator, which is the LLM.

While RAG is the most popular approach, it is best suited for building context from unstructured data that has been indexed and stored in a vector database. Before RAG can retrieve the context, a batch process converts the unstructured data into text embeddings and stores them in the vector database. This makes RAG ideal when dealing with data that doesn't change often.

When applications require context from real-time data — such as stock quotes, order tracking, flight statuses or inventory management — they rely on the function-calling capability of LLMs. The goal of both RAG and function calling is to supplement the prompt with context — whether from existing data sources or real-time APIs — so that the LLM has access to accurate information.

LLMs with function-calling capabilities are foundational to the development of AI agents that perform specific tasks autonomously. For instance, these capabilities allow for the integration of LLMs with other APIs and systems, enabling the automation of complex workflows that involve data retrieval, processing and analysis​.

A Closer Look at Function Calling

Function calling, also known as tool use or API calling, is a technique that allows LLMs to interface with external systems, APIs and tools. By providing the LLM with a set of functions or tools, along with their descriptions and usage instructions, the model can intelligently select and invoke the appropriate functions to accomplish a given task.

This capability is a game-changer, as it enables LLMs to break free from their text-based limitations and interact with the real world. Instead of merely generating text, LLMs can now execute actions, control devices, retrieve information from databases, and perform a wide range of tasks by leveraging external tools and services.

Not every LLM is capable of function calling. Only LLMs that have been specifically trained or fine-tuned for it can determine whether a prompt demands function calling. The Berkeley Function-Calling Leaderboard provides insight into how LLMs perform across various programming languages and API scenarios, showing the versatility and robustness of function-calling models in handling multiple, parallel and complex function executions. This versatility is crucial for developing AI agents that can operate across different software ecosystems and handle tasks that require simultaneous actions.

Applications typically invoke the LLM with function-calling capabilities twice: once to map the prompt into the target function name and its input arguments, and again to send the output of the invoked function to generate the final response.

The workflow below shows how the application, function and LLM exchange messages to complete the entire cycle.

Step 1: The user sends a prompt that may demand access to the function — for example, “What’s the current weather in New Delhi?”

Step 2: The application sends the prompt along with all the available functions. In our example, this may be the prompt along with the input schema of the function get_current_weather(city). The LLM determines whether the prompt requires function calling. If yes, it looks up the provided list of functions and their respective schemas, and responds with a JSON object containing the selected functions and their input arguments.

Step 3: The application parses the LLM response. If it contains the functions, it will invoke them either sequentially or in parallel.

Step 4: The output from each function is then included in the final prompt and sent to the LLM. Since the model now has access to the data, it responds with an answer based on the factual data provided by the functions.
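To make these four steps concrete, here is a minimal sketch of the two-call pattern using the OpenAI Python SDK for the weather example above. The model name is illustrative and get_current_weather is a hypothetical stand-in for a real weather API; other providers follow the same overall shape with different client libraries.

import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def get_current_weather(city: str) -> dict:
    """Hypothetical stand-in for a real weather API call."""
    return {"city": city, "temperature_c": 31, "conditions": "sunny"}

# Step 2: send the prompt along with the input schema of the available function.
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a given city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
messages = [{"role": "user", "content": "What's the current weather in New Delhi?"}]
first = client.chat.completions.create(model="gpt-4-turbo", messages=messages, tools=tools)

# Step 3: parse the tool call the model selected and invoke the function.
tool_call = first.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = get_current_weather(**args)

# Step 4: return the function output so the model can ground its final answer.
messages.append(first.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": json.dumps(result)})
final = client.chat.completions.create(model="gpt-4-turbo", messages=messages, tools=tools)
print(final.choices[0].message.content)

If the model returns multiple tool calls in Step 3, the application can execute them sequentially or in parallel and append one tool message per call before making the second request.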

Integrating RAG and Function Calling

The integration of RAG with function calling can significantly enhance the capabilities of LLM-based applications. RAG agents based on function calling utilize the strengths of both approaches — leveraging external knowledge bases for accurate data retrieval while executing specific functions for efficient task completion.

Using function calling within a RAG framework enables more structured retrieval processes. For instance, a function can be predefined to extract specific information based on user queries, which the RAG system retrieves from a comprehensive knowledge base. This method ensures that the responses are not only relevant but also precisely tailored to the needs of the application.

For example, in a customer support scenario, the system could retrieve product specifications from a database and then use a function call to format this information for user queries, ensuring consistent and accurate responses.
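As a rough sketch of that pattern (all names below are hypothetical, and the in-memory dictionary stands in for the knowledge base or vector store a real RAG system would query), the retrieval step itself can be exposed to the LLM as a callable function:

import json

# Illustrative stand-in for a support knowledge base; a real RAG system
# would query a vector store or database here instead.
PRODUCT_SPECS = {
    "widget-pro": {"battery_life_hours": 12, "weight_grams": 180, "warranty_years": 2},
}

def retrieve_product_specs(product_id: str) -> str:
    """Retrieval step exposed to the LLM as a callable function."""
    return json.dumps(PRODUCT_SPECS.get(product_id, {}))

# Schema the application advertises alongside the user's query, so the
# model can decide when product data needs to be fetched before answering.
retrieval_tool = {
    "type": "function",
    "function": {
        "name": "retrieve_product_specs",
        "description": "Look up product specifications in the support knowledge base",
        "parameters": {
            "type": "object",
            "properties": {"product_id": {"type": "string"}},
            "required": ["product_id"],
        },
    },
}

The application would pass retrieval_tool in its tools list exactly as in the weather sketch above, letting the model combine retrieved facts with a consistently formatted answer.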

Moreover, the RAG agents can handle complex queries by dynamically interacting with external databases and APIs through predefined functions, thereby streamlining application workflows and reducing the need for manual intervention. This approach is particularly beneficial in environments where quick decision-making is crucial — such as in financial services or medical diagnostics, where the system can pull the latest research or market data and immediately apply functions to analyze this information.

Choosing LLMs With Function Calling Support

It’s important to choose the right LLM that supports function calling to build agentic workflows and RAG agents. Below is a list of commercial and open LLMs that are ideal for function calling.

OpenAI GPT-4 and GPT-3.5 Turbo

OpenAI's GPT-4 and GPT-3.5 Turbo models are the most well-known commercial LLMs that support function calling. Developers define custom functions that the LLM can call during inference to retrieve external data or perform computations. The LLM outputs a JSON object containing the function name and arguments; the developer's code then executes that function and returns its output to the LLM.

Google Gemini

Google's Gemini LLM also supports function calling through Vertex AI and Google AI Studio. Developers can define functions and descriptions, which the Gemini model can invoke during inference by returning structured JSON data.
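As a hedged sketch, assuming the google-generativeai Python SDK (the exact calls are illustrative and the SDK surface has evolved over time), a plain Python function can be registered as a tool and invoked automatically during a chat:

import google.generativeai as genai

def get_current_weather(city: str) -> dict:
    """Hypothetical stand-in for a real weather API."""
    return {"city": city, "temperature_c": 31, "conditions": "sunny"}

genai.configure(api_key="YOUR_API_KEY")

# The SDK derives the function-calling schema from the Python signature and docstring.
model = genai.GenerativeModel("gemini-1.5-pro", tools=[get_current_weather])

# With automatic function calling, the SDK executes the tool and feeds the
# result back to the model before returning the final text response.
chat = model.start_chat(enable_automatic_function_calling=True)
response = chat.send_message("What's the current weather in New Delhi?")
print(response.text)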

Anthropic Claude

Anthropic’s Claude 3 family of LLMs has an API that enables function-calling capabilities similar to OpenAI’s models.

Cohere Command

Cohere’s Command R and Command R+ LLMs also provide an API for function calling, allowing integration with external tools and data sources.

Mistral

The open source Mistral 7B LLM has demonstrated function-calling capabilities, allowing developers to define custom functions that the model can invoke during inference.

NexusRaven

NexusRaven is an open source 13B LLM that has been specifically designed for advanced function calling, surpassing even GPT-4 in some benchmarks for invoking cybersecurity tools and APIs.

Gorilla OpenFunctions

The Gorilla OpenFunctions model is a 7B LLM that is fine-tuned on API documentation. It can generate accurate function calls and API requests from natural language prompts.

Fireworks FireFunction

FireFunction V1 is an open source function-calling model based on Mixtral 8x7B. It achieves near GPT-4-level quality for real-world use cases of structured information generation and routing decision-making.

Nous Hermes 2 Pro

Hermes 2 Pro is a 7B-parameter model that excels at function calling, JSON-structured outputs and general tasks. It achieves 90% accuracy on a function-calling evaluation and 81% on a structured JSON output evaluation, both built with Fireworks.ai. Hermes 2 Pro is available fine-tuned on both the Mistral 7B and Llama 3 8B base models, offering developers a choice.

In upcoming articles on function calling, I will explore how to implement this capability with both commercial and open LLMs, in order to build a chatbot that has access to real-time data.

The post A Comprehensive Guide to Function Calling in LLMs appeared first on The New Stack.


Meta Releases Open Source React Compiler

Joe Savona, a React engineer at Meta, took the stage at React Conference 2024 to explain React Compiler.

Meta released an open source compiler for React on Wednesday at the React Conference, which was held in Las Vegas and live-streamed. Joe Savona, a member of the React team at Meta and a user interface engineer, said the team had been developing the compiler over the past few years.

“React Compiler automatically optimizes your components and hooks, so that only the minimal parts of your UI update as state changes,” Savona told audiences. “So that seems pretty magical.”

Why a React Compiler?

Why would an interpreted language need a compiler? Savona compared React Compiler to Meta's Hermes and Google's V8, JavaScript engines that include compilers.

“React Compiler is more like TypeScript, or the compilers inside JavaScript engines, like V8 or Hermes,” he said. “It breaks your code down into individual expressions and builds up a control flow and data flow graph. It performs sophisticated optimizations, like dead code elimination, constant propagation, type inference, even alias analysis and a whole bunch more.”

These optimizations would typically appear inside a JavaScript engine or compiler for languages such as Rust or C++, he added. The React Compiler applies these ideas to JavaScript to improve performance without sacrificing developer experience, he continued. The code doesn’t change but the apps and updates become faster by default, he said.

“In fact, our code gets even more clean and concise, because we don’t need manual memoization,” Savona added. Memoization is an optimization technique that caches the results of expensive computations so they can be reused instead of recomputed.
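Memoization itself is independent of React. As a minimal, illustrative sketch of the caching idea (shown in Python rather than JavaScript, and not a description of how React Compiler is implemented), functools.lru_cache stores a function's results so repeated calls with the same inputs skip the recomputation:

from functools import lru_cache

@lru_cache(maxsize=None)
def filter_songs(query: str, songs: tuple) -> tuple:
    """Recomputes only when the query or the song list actually changes."""
    return tuple(s for s in songs if query.lower() in s.lower())

songs = ("Blue in Green", "So What", "Freddie Freeloader")
filter_songs("blue", songs)  # computed on the first call
filter_songs("blue", songs)  # served from the cache, no recomputation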

Applying React Compiler to Code

He pointed to an earlier demo of a media player built with React. There are no useMemo calls and no random, unnecessary callback functions, but there is a lot of information contained in the code that developers might not always think about, he said.

“For example, consider the filtered songs list,” he said. “React is obviously going to have to filter that songs list again and update the playlist. This code also tells us the inverse; if a song doesn’t change, then React shouldn’t have to update the filtered songs and it doesn’t need to update the playlist.”

That’s exactly what happens with the compiler enabled, he added: There’s no need to update the playlist when the song doesn’t change.

“We’re able to understand this because JavaScript and React have clear rules for how they work and we’ve learned those rules,” Savona said. “We use them every day as we’re reading code, writing code, debugging code and occasionally testing code.”

The compiler can be taught to understand those rules. Then it can “see” the code “very much the same way that we do, except it’s way more thorough,” he added.

It can apply the same process to every value in the code, creating a computation graph of how the data moves through the user interface, he said.

“The compiler knows the rules of React, and it knows some extra things,” he continued. “For example, it knows that set state functions don’t actually change, so the compiler doesn’t have to consider whether that value could update. So we can remove that dependency edge.”

Then the only thing that needs to update is the now-playing view, rather than the whole playlist.

“The best part is that all this information comes from the same React components and hooks that you’re already writing today,” he said. “There’s no API to learn. There’s no need to change the way you write code. We can simply take advantage of all the information that’s already right there in your code today.”

These types of precise UI updates are sometimes called fine-grained reactivity, he added.

Next, Mofei Zhang, a Meta software engineer, took the stage to show how the compiler performed when used on apps at Meta. On pages with a lot of interactivity or many components, Meta found that developers started making trade-offs.

“A small sample of the code that we saw was pretty difficult to read,” Zhang said. “This code contained a lot of manual optimizations, which helped make the UI feel snappier, but also made the code much more difficult to reason about.”

She showed a code snippet that had almost 20 places where the developer had hand-optimized an object.

Meta Tries React Compiler on Instagram and Quest

Then in 2023, Meta began rolling out React Compiler on two apps: Instagram.com and its Quest store, the app store for Meta's Quest VR devices. The compiler-optimized version was more than twice as fast as before, she said. Initial load times and cross-page navigation times improved by up to 12%. And there was no impact on either memory usage or overall crashes, which can reflect out-of-memory errors. Navigation loads in the Quest store were at least 4% faster, and Instagram saw an average 3% improvement across every route.

“At Meta, this was a really big deal,” Zhang said. “To put these numbers in context, engineers had combed through every single bit of these apps to add thousands of memoization calls.”

Typically, just a 1 or 2 percentage point improvement in a metric such as time to first paint on a specific page would be a huge deal, she added.

“React Compiler had significantly improved the performance of nearly every page it rolled out on,” she said. “On less optimized apps, we found that React Compiler can add over 15 times the amount of memoization already present in source.”

Using the compiler will allow developers to get fast interactions on apps that haven’t been manually optimized while improving the readability and maintainability of already optimized apps, she concluded. Developers can even strip out the hand-coded memoization.

“We’ve just seen how React Compiler can even raise the ceiling for just how fast apps can be, as it can optimize much more than developers can with nested and conditional memoization, as well as sophisticated static analysis,” she said. “We’ve continued to roll out to even more apps to iterate on the compiler.”

In addition to its open source code, there is a React Compiler Playground for developers who want to explore how it works.

Savona took the stage to wrap up.

“React Compiler allows developers to continue writing the exact same code you’re used to,” he said. “In fact, as we saw, we can stop using manual memoization. … React Compiler can deliver significant performance improvements to real-world applications.”

The post Meta Releases Open Source React Compiler appeared first on The New Stack.


Building Better Azure Apps with Better Together

From: Microsoft Developer
Duration: 10:45

Helping you build better apps has been one of our key focus areas in Azure. Our latest tooling focuses on providing guidance for architecting, optimizing, and deploying apps. Whether you’re creating a new proof of concept or improving an existing app, these capabilities can boost productivity and performance. These capabilities are all in Preview, so please give them a try and let us know what you think!

https://aka.ms/bettertogetherblog

0:00 Intro
0:32 Better Together Azure Copilot
2:05 Diagnose and Solve Issues
3:36 Better Together Post-Create
10:21 Conclusion


FreeCodeSession - Episode 545

From: Jason Bock
Duration: 26:35

In this episode, I move common serialization code into one location, I mention how long it took to get open generics in Rocks to work, I update generated code to call the shared context methods, I count parentheses, I complete updating code generation, and I decide to test in the next episode.

Links:
https://github.com/JasonBock/CslaGeneratorSerialization/issues/6


How a Product Owner Can Foster Agile Team Success with Collaboration and Inspiration | Peter Müller


Peter Müller: How a Product Owner Can Foster Agile Team Success with Collaboration and Inspiration

Read the full Show Notes and search through the world’s largest audio library on Scrum directly on the Scrum Master Toolbox Podcast website: http://bit.ly/SMTP_ShowNotes.

The Great Product Owner: How a PO Can Foster Agile Team Success with Collaboration and Inspiration

In this segment, Peter celebrates the qualities of an exemplary Product Owner who champions shared ownership and visionary leadership. Peter explains how a confident and forward-thinking Product Owner can significantly impact a team’s performance and satisfaction. He also shares examples of how such Product Owners facilitate a proactive environment, encourage stakeholder engagement, and lead by envisioning a collective future. 

The Bad Product Owner: Transforming the Micro-Manager, and How Scrum Masters Can Help POs

This segment unveils the challenges and setbacks of a micro-managing Product Owner. Peter examines how such behaviors can demoralize a team, stifle creativity, and lead to a passive work environment. Through various anecdotes, he illustrates the pitfalls of excessive control and the importance of fostering a shared ownership culture within agile teams. We also discuss some possible strategies that can help Product Owners overcome the urge to micromanage and instead build a collaborative team dynamic.

 

Are you having trouble helping the team work well with their Product Owner? We've put together a course to help you work on the collaboration between the team and the Product Owner. You can find it at bit.ly/coachyourpo: 18 modules, 8+ hours of content, with tools and techniques that you can use to help teams and POs collaborate.

 

About Peter Müller

Peter is a seasoned Agile coach and transformation consultant with extensive experience in fostering agile environments and enhancing team dynamics. His expertise in solution-focused coaching has helped numerous teams optimize their operational efficiency and adapt to agile methodologies effectively.

 

You can link with Peter on LinkedIn and connect with Peter on Twitter.





Download audio: https://traffic.libsyn.com/secure/scrummastertoolbox/20240517_Peter_Mller_F.mp3?dest-id=246429

Why can’t I find the injected name of a templated class’s templated base class?


Some time ago, I wrote about how injected class names were the C++ feature you didn’t even realize that you were using. Injected class names let you use the plain name for the class being defined without needing to fully qualify it with namespaces and template parameters. Furthermore, injected class names are public and can be inherited.

“But wait, I’m trying to use the injected class name of my base class, but the compiler won’t accept it.”

template<typename T>
struct Base
{
    Base(T value);
};

template<typename T>
struct Derived : Base<T>
{
    Derived(T value) : Base(value) {}
};

This generates a compiler error.

// clang
error: member initializer 'Base' does not name a non-static data member or base class

    Derived(T value) : Base(value) {}
                       ^~~~~~~~~~~

// gcc
error: class 'Derived<T>' does not have any field named 'Base'

    Derived(T value) : Base(value) {}
                       ^~~~

// msvc
error C2512: 'Base<T>': no appropriate default constructor available
error C2614: illegal member initialization: 'Base' is not a base or member

While it's true that Base is the injected class name of Base<T>, it is also a dependent type, since its presence is indirectly dependent on the template type parameter T. Specifically, the name Base is provided by Base<T>, and that is dependent on T.

Now, you and I can plainly see that no matter how Base<T> is defined or specialized, there will always be a type named Base that refers to Base<T> due to name injection, but the compiler doesn’t do this sort of fancy logical deduction.

Consider this alternate formulation:

template<typename T>
struct Base
{
    Base(int value);
    using Type = T;
};

template<typename T>
struct Derived : Base<T>
{
    Type m_value;
};

Here, Type is much more obviously a dependent type, since it is patently dependent on T. (And you might have a specialization of Base<T> which defines Type differently, or maybe doesn’t even define it at all.)

At the time a template is parsed, non-dependent names are resolved immediately. The dependent names are resolved at the time a template is instantiated. This is known in the jargon as two-phase name lookup. Or as the LLVM blog once called it, the dreaded two-phase name lookup.

One solution is to eschew the injected name and use the original full name:

template<typename T>
struct Derived : Base<T>
{
    Derived(T value) : Base<T>(value) {}
};

However, this can get clumsy if the base type name is unwieldy, say, because it comes from another namespace, or the template parameters are complex.

template<typename T>
struct Derived :
    Contoso::Faraway::Base<                      
        std::conditional_t<std::is_class_v<T>, T,
            std::tuple<T>>>                      
{
    Derived(T value) :
        Contoso::Faraway::Base<                      
            std::conditional_t<std::is_class_v<T>, T,
                std::tuple<T>>>(value) {}            
};

But wait, all is not lost. We just need to postpone the lookup to the second phase by making it a dependent type. And there are two common ways of doing this.

  • For data members and member functions, explicitly prefix them with this-> to make them dependent on the full definition of the template class.
  • For static members and member types, explicitly scope-qualify them with the name of the template class, and prefix them with typename if you’re looking for a member type.

(If you want to access a static data member or static member function, both options are available, since you are allowed to use this to access a static data member or static member function.)

Therefore, one solution here is to write

template<typename T>
struct Derived : Base<T>
{
    Derived(T value) : Derived::Base(value) {}
};

Another is to explicitly import the type name Base:

template<typename T>
struct Derived : Base<T>
{
    using Base = typename Derived::Base;

    Derived(T value) : Base(value) {}
};

Additional reading: Why am I getting errors when my template-derived-class uses a member it inherits from its template-base-class?

The post Why can’t I find the injected name of a templated class’s templated base class? appeared first on The Old New Thing.
