Content Developer II at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.
121675 stories
·
29 followers

What Does WebAssembly Mean for the Server and GenAI?

Solomon Hykes, one of the co-founders of Docker, was quoted as saying, “If WASM+WASI existed in 2008, we wouldn’t have needed to create Docker. That’s how important it is. WebAssembly on the server is the future of computing.”

In a previous blog post “WASM your way to the cloud” I explored what Wasm meant for cloud native architectures along with some background on the history and when it entered the chat for cloud. Now, I plan to explore why Wasm matters to the server.

This of course can mean many things, including why the footprint and performance of Wasm applications are important, why that matters to Generative AI and how the security and multiarchitecture aspects are key to a happier server. Let’s dive in.

The Fine Qualities of WASM

Wasm offers conceptual benefits similar to those of containers: Wasm binaries are multilanguage, portable, performant and secure. These characteristics excited the tech industry when containers and microkernels came into the picture in the last decade. Let’s explore why these characteristics for Wasm applications represent a new wave of excitement.

Portability

Portability allows an application to be deployed, moved and transferred across environments, which is critical to today’s multicloud and multicluster architectures. Wasm achieves portability by offering a binary format that allows code to run on a variety of architectures. This is one area where it’s worth distinguishing between Wasm’s portability and container compatibility. With container images, the CPU architecture and the distribution and version of the operating system matter. For example, if you wanted to build a container for a specific system architecture, you would have to pass the --platform flag in the build process and tag the image appropriately for each architecture on which you would like to run the container.

Operating systems are also a consideration. For instance, there is no way to run Windows containers on Linux without virtualization because Windows containers need the Windows kernel, a Hyper-V host or support for the host compute service (HCS). This is one of the areas where Wasm tends to shine. Wasm applications are made up of “modules” (see WebAssembly Component Model), and these modules are compiled down to bytecode (a binary format). This binary format makes a Wasm module much more portable than the output of language-specific toolchains such as those for JavaScript, Python or Rust.
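
To make the “binary format” point concrete, here is a minimal sketch in Rust that recognizes a core Wasm module by its preamble: the four magic bytes "\0asm" followed by a little-endian version number. The function name wasm_version is made up for illustration, and the sketch only inspects the first eight bytes; it does not validate the rest of the module.

```rust
/// The 4-byte magic number that opens every WebAssembly binary: "\0asm".
const WASM_MAGIC: [u8; 4] = [0x00, 0x61, 0x73, 0x6D];

/// Returns the version field if `bytes` starts with a Wasm preamble
/// (magic number followed by a little-endian u32), otherwise None.
/// Illustrative helper only: nothing past the first 8 bytes is checked.
fn wasm_version(bytes: &[u8]) -> Option<u32> {
    if bytes.len() < 8 || bytes[..4] != WASM_MAGIC {
        return None;
    }
    Some(u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]))
}

fn main() {
    // The smallest valid core module is just the 8-byte preamble.
    let empty_module = [0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00];
    println!("empty module: {:?}", wasm_version(&empty_module));

    // A shell script is not a Wasm binary.
    println!("shell script: {:?}", wasm_version(b"#!/bin/sh"));
}
```

The same eight bytes open a core module no matter which source language produced it or which CPU and operating system will run it, which is part of why one .wasm file can move between hosts unchanged.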

Example WASM Application

We will use a Wasm application framework from Fermyon called Spin to help demonstrate this. Fermyon recently released SpinKube during KubeCon EU Paris 2024 as a Cloud Native Computing Foundation (CNCF) sandbox project. It can enable much the same use case we will be showing here; check that out separately. Spin takes advantage of the component model along with the Wasmtime runtime. Wasmtime uses the component model along with the WebAssembly System Interface (WASI), which allows Wasm applications to run on the server rather than only in the browser. WASI provides access to operating system features like filesystems, networks and more in a POSIX-like way.

This example will show how to build a Rust-based Wasm application, producing the .wasm binary with Rust’s Wasm+WASI target wasm32-wasi, and then run it on both Ubuntu Linux 23.10 and Windows 11 without modification.

> Note: we won’t go into the full Spin and Rust environment here. For more information, check out the Spin quickstart or try it yourself in this lab.

First, the Rust application is defined as a simple HTTP handler and response.
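
As a rough illustration of what “a simple HTTP handler and response” looks like, here is a minimal sketch that uses only the Rust standard library. It is not the Spin application itself: a real Spin component would use the request and response types from the Spin SDK, while the Request and Response structs and the handle_hello function below are hypothetical stand-ins chosen so the sketch compiles on its own.

```rust
/// Hypothetical stand-in for the request type an HTTP framework provides.
struct Request {
    method: String,
    path: String,
}

/// Hypothetical stand-in for the response type an HTTP framework provides.
struct Response {
    status: u16,
    content_type: &'static str,
    body: String,
}

/// The handler: receives a request and returns a plain-text greeting.
fn handle_hello(req: &Request) -> Response {
    Response {
        status: 200,
        content_type: "text/plain",
        body: format!("Hello from Wasm! You sent {} {}", req.method, req.path),
    }
}

fn main() {
    let req = Request {
        method: "GET".to_string(),
        path: "/".to_string(),
    };
    let res = handle_hello(&req);
    println!("{} ({})", res.status, res.content_type);
    println!("{}", res.body);
}
```

The shape is the important part: one function from request to response, with no server loop or socket handling in the application code, because the host runtime owns the network side.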

Then we add the Rust Wasm+WASI target to the development environment.

Next, build the .wasm binary.

This produces the .wasm binary in the target build folder.

Now we can run the Wasm module on the Ubuntu server.

Then, access the application, which serves a simple HTTP response.

Next, upload our Wasm module to the GitHub Packages repo (ghcr).

Now on our Windows 11 Desktop, we can run the Wasm module from PowerShell with one command without any modification across platforms.

I want to be clear: this simple example is not entirely dissimilar to running the same code on Linux and Windows. However, even here there was no need to install any application- or library-specific dependencies and no need to rebuild the binary; we needed only Spin, which knows how to run Wasm binaries. The example is meant to highlight the portability and usefulness of the binary format across Linux and Windows; applications can, of course, become more complex.

Security

Wasm improves overall security and reduces the attack surface by running each Wasm module in its own sandboxed environment, isolated from the host runtime. This means a running Wasm application has no visibility into the host operating system beyond what the runtime exposes, and system resources can be accessed only through WASI.

Wasm also provides a linear memory concept: a contiguous array of memory with a maximum size. This is an isolated memory region for the Wasm module to use. Misuse of this memory can cause traps (exceptions) that are reported up the stack in the runtime. Because the application can touch only this isolated region and not the memory of the runtime itself, the design helps mitigate, but not eliminate, certain memory safety bugs such as buffer overflows and unsafe pointers. Most of the details are beyond the scope of this article.

Wasm is a fast-moving project, and as the spec and its implementations grow over time, this design does not eliminate potential bugs or security concerns in runtimes. However, Wasm’s design for isolation and sandboxing does positively set up the overall security posture.
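
The linear memory idea can be modeled in a few lines of Rust. The sketch below is a toy, not how any particular runtime is implemented: the LinearMemory type, the Trap enum and their method names are invented for illustration. It keeps one contiguous byte array that grows in 64 KiB pages up to a fixed maximum and turns every out-of-bounds access into a trap instead of touching memory outside the region.

```rust
/// Wasm memories are sized in pages of 64 KiB.
const PAGE_SIZE: usize = 65_536;

/// Illustrative trap type: the only failure modeled here is an
/// access outside the module's own memory.
#[derive(Debug, PartialEq)]
enum Trap {
    OutOfBounds,
}

/// Toy model of a linear memory: contiguous bytes plus a page limit.
struct LinearMemory {
    bytes: Vec<u8>,
    max_pages: usize,
}

impl LinearMemory {
    fn new(initial_pages: usize, max_pages: usize) -> Self {
        LinearMemory {
            bytes: vec![0; initial_pages * PAGE_SIZE],
            max_pages,
        }
    }

    /// Grows by `delta` pages and returns the previous page count,
    /// or None if that would exceed the configured maximum.
    fn grow(&mut self, delta: usize) -> Option<usize> {
        let old_pages = self.bytes.len() / PAGE_SIZE;
        let new_pages = old_pages.checked_add(delta)?;
        if new_pages > self.max_pages {
            return None;
        }
        self.bytes.resize(new_pages * PAGE_SIZE, 0);
        Some(old_pages)
    }

    /// Reads one byte; any address past the end traps.
    fn load(&self, addr: usize) -> Result<u8, Trap> {
        self.bytes.get(addr).copied().ok_or(Trap::OutOfBounds)
    }

    /// Writes one byte; any address past the end traps.
    fn store(&mut self, addr: usize, value: u8) -> Result<(), Trap> {
        match self.bytes.get_mut(addr) {
            Some(slot) => {
                *slot = value;
                Ok(())
            }
            None => Err(Trap::OutOfBounds),
        }
    }
}

fn main() {
    let mut mem = LinearMemory::new(1, 2);
    println!("store in bounds:  {:?}", mem.store(0, 42));
    println!("load in bounds:   {:?}", mem.load(0));
    println!("store past end:   {:?}", mem.store(PAGE_SIZE, 7));
    println!("grow by one page: {:?}", mem.grow(1));
    println!("store after grow: {:?}", mem.store(PAGE_SIZE, 7));
    println!("grow past max:    {:?}", mem.grow(1));
}
```

Every address the module uses is an offset into this one array, so a bad pointer can corrupt the module’s own data but cannot reach the host’s memory; that is the sense in which the design mitigates rather than eliminates memory bugs.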

Performance

Wasm also offers impressive performance, and this is important because even if something were hyper-portable, it wouldn’t really be that useful if it weren’t an efficient computing mechanism. Wasm is inherently more compact than alternatives in browsers such as JavaScript. This is because Wasm is compiled into a binary containing bytecode, whereas JavaScript is shipped as source text and interpreted. The result, for example, is that a native Rust application built into a container can be much larger and slower on initial startup than a Rust-based Wasm binary module.

Check out the size difference of this Rust HTTP server Wasm module compared to a container that does the same thing built off the rust:1.77 image.

Rust Wasm module

Using the rust:1.77 container image:

Using the rust:1.77-slim container image:

There are ways to get image sizes down to tens of MB, such as multistage builds or scratch images; however, this is not likely to be the way many developers start. It should be noted that by using these techniques, we can produce the much smaller image below, but it is still five times larger than the Wasm binary alone.

Using the multistage build with rust:alpine container image:

Linear memory also makes memory access generally more efficient, and other features such as parallel threads can improve CPU utilization. Performance is and will be an important characteristic of applications running in public and private clouds, including use cases such as serverless and AI.

What This Means for the Server

Portability, security and performance explain why Linux container technology has thrived in today’s data center architectures, and Wasm can extend these benefits for applications beyond the browser running on the server. Let’s look at a few examples.

Cloud Applications

Wasm applications using server-side interfaces and runtimes such as WASI, Wasmtime and WasmEdge can benefit from many of the same developer toolchains that containers use as an onramp, as we have shown above. That means there is a natural progression for development teams to start experimenting with Wasm on the server. Using existing cloud native tooling is not the only way Wasm applications can run, but it does make it easier for teams developing cloud-based applications with containers and Kubernetes to start implementing Wasm modules. The portability and performance of these Wasm modules can also improve density and start times for applications.

Edge, IoT and Serverless

The portable, small form factor of Wasm applications makes them ideal for decentralized computational workloads at the edge. Wasm applications take up less space, which is perfect for many edge and Internet of Things (IoT) use cases dealing with limited compute capacity. The performance and fast startup times of Wasm binaries also make them ideal candidates for serverless computing, where cold starts can be minimized for serverless functions.

Generative AI

GenAI and its use cases are quickly becoming one of the most focused-on technologies within most companies, and rightfully so. GenAI has many compelling use cases, and there has been a lot of recent work within the cloud native and Kubernetes communities as well as in the enterprise to enable AI workloads for consumers.

If you would like to dig in, check out this design guide as an example. 

GenAI has various architectures and components depending on the use case; however, many of these stacks share common components.

Two main components of the GenAI stack are the models, such as Llama 2, Mistral, GPT-4 and others that run on GPUs, and the inferencing components, which communicate with a model and handle inferencing requests. Inferencing is the operational task of running data through a model to complete a task.

For instance, if you were to ask a question such as “Tell me a joke about dogs” to OpenAI’s ChatGPT, it takes the question and performs an inferencing step to provide you with a response.
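
The split between a model and an inferencing component can be sketched in Rust as a trait plus a small request handler. Everything named here is hypothetical: the Model trait, the CannedModel stub (which returns a fixed string rather than running a real model) and the handle_inference function are illustrative assumptions, not part of any SDK mentioned in this article.

```rust
/// Hypothetical abstraction over "something that runs a prompt through a model".
trait Model {
    fn infer(&self, prompt: &str) -> String;
}

/// Stub model for illustration: it does no real inference and simply
/// echoes the prompt back inside a fixed template.
struct CannedModel;

impl Model for CannedModel {
    fn infer(&self, prompt: &str) -> String {
        format!("(model output for: {})", prompt.trim())
    }
}

/// The inferencing component: validates the request, hands the prompt
/// to the model and returns the model's answer to the caller.
fn handle_inference(model: &dyn Model, prompt: &str) -> Result<String, String> {
    if prompt.trim().is_empty() {
        return Err("prompt must not be empty".to_string());
    }
    Ok(model.infer(prompt))
}

fn main() {
    let model = CannedModel;
    println!("{:?}", handle_inference(&model, "Tell me a joke about dogs"));
    println!("{:?}", handle_inference(&model, "   "));
}
```

Keeping the handler separate from the model is what lets the small, fast-starting part run as a Wasm module while the heavy model stays on GPU-backed infrastructure behind an interface.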

Inferencing steps are one example where Wasm can fit quite nicely. Take this example from Fermyon using its Spin framework where Wasm modules are used in a serverless architecture. The example Wasm application can be defined as:

This shows how Wasm can be used within the Spin framework and the cloud. You can read more about it here to learn how the small footprint, performance and security of Wasm translate into improvements in density, potential cost optimizations and wider use cases for where Wasm modules and AI can run at the edge. Wasm can also be thought of as a universal runtime for the deployment of AI components, making it easy for researchers and developers to run models via Wasm runtimes that are highly portable.

Conclusion

Wasm, in my opinion, is here to stay on the server and will grow for years to come. The portability, performance and security that the industry loves about containers are extended to the architecture and the deployment of WebAssembly applications on the server. With the growing support of Wasm within the cloud native ecosystem toolchain and the obvious connections with use cases like edge, IoT, serverless and GenAI, there is no doubt that Wasm will be important to the server.

The post What Does WebAssembly Mean for the Server and GenAI? appeared first on The New Stack.

Read the whole story
alvinashcraft
9 minutes ago
reply
West Grove, PA
Share this story
Delete

Teams Toolkit for Visual Studio Code update – April 2024

In this April 2024 update of Teams Toolkit for Visual Studio Code, we’ve added improvements for building API message extensions with an auth-protected API, new getting started experiences for building intelligent chat bots, features for building add-ins for Word, Excel, and PowerPoint, and more.

Create API based message extensions using auth-protected API

Teams Toolkit supports two types of API authentication protection in your API based Message extension apps:

  • API-Key: you can either add the API key of your existing API, or if you don’t have an API, Teams Toolkit can create a new API project for you.
  • Microsoft Entra (Azure AD): Teams Toolkit can help you create a Microsoft Entra ID to authenticate your new API.

Debug message extensions in Teams App Test Tool

Teams App Test Tool helps developers debug and test apps in a web-based environment with Microsoft Teams-like features, without requiring network tunnels or a Microsoft 365 account. In this version we’ve added support for search-based, action-based, and link unfurling Message extensions.

Running your Teams apps with these capabilities with the Test Tool shows a familiar interface that makes iterating on your app simple and fast.

Create an intelligent chatbot with domain knowledge from custom data

The new Custom copilot template helps you get started with building an AI-powered chatbot that can understand natural language and retrieve custom data to answer domain-specific questions using Retrieval-Augmented Generation (RAG).

When creating the Custom copilot app, you can select “Chat with your data” and then select the desired data source.

You can choose from four kinds of data source:

  • Custom data source: you can add whatever data sources you want to a Custom copilot app, for example: file system or vector database.
  • Azure AI Search: your chatbot can access data on an Azure AI Search service and use it in conversation with users.
  • Custom API: your chatbot can invoke the API defined in the OpenAPI description document to retrieve domain data from an API service.
  • Microsoft Graph + SharePoint: your chatbot can query M365 data from the Microsoft Graph Search API as a data source in the conversation.

Develop Word, Excel and PowerPoint Add-ins in Teams Toolkit

Teams Toolkit now supports Microsoft Word, Excel, or PowerPoint JavaScript add-in development and includes features for checking dependencies, running and debugging add-ins, managing lifecycle, providing feedback, and more.

Enhancements

We’ve smoothed the experience of creating Entra ID client secrets with features that let you customize the `clientSecretExpireDays` and `clientSecretDescription` parameters in teamsapp.yml.

Share your feedback

We’d really appreciate your early feedback! Download the latest prerelease of Teams Toolkit and explore these new features and improvements today!

Remember, your feedback is valuable in shaping the future of Teams Toolkit. Share your thoughts and suggestions with us on GitHub, and let’s build together!

Follow us on X (Twitter) / @Microsoft365Dev, LinkedIn, and subscribe to our YouTube channel to stay up to date on the latest developer news and announcements.

The post Teams Toolkit for Visual Studio Code update – April 2024 appeared first on Microsoft 365 Developer Blog.

Is GenAI the next dot-com bubble?

The home team talks about the current state of the software job market, the changing sentiments around AI job opportunities, the impact of big players like Facebook and OpenAI on the space, and the challenges for startups. Plus: The philosophical implications of LLMs and the friendship potential of corvids.

Enterprise best practices: successfully govern your API content and users

There are a ton of things in Postman that enterprises should be taking advantage of to accelerate collaboration and consistency at scale. In this blog post, we’ll go over actionable information and resources related to user management, naming standards, SSO/SCIM, and much more.

Collaborative Postman workspaces enable every person who has access to the workspace to see the same collections, environments, and other assets. This allows for shared-context building that is much faster than traditional modes of collaboration through sending documents over email, and it’s more interactive than traditional portals or docs.

When you provide colleagues and collaborators with better context around your business and stakeholders, you help users make more successful API calls—faster.

Workspace naming conventions

It’s important to name your workspace so that others can understand it without talking to you or sending you an email or additional message.

Have a workspace with the same name as your business unit where you list all the collections and APIs that your team is responsible for. This can be an example workspace that is used as a reference for organizational taxonomy, etc. When designing your taxonomy structure, it’s important to note the hierarchy of elements in the Postman API Platform:

  • Level 1 = Workspace
  • Level 2 = Collection, API, etc.
  • Level 3 = Additional description for collection, API, etc.

Don’t have multiple workspaces with the same name; this will lead to workspace chaos. Team workspaces should facilitate intentional collaboration.

User Groups

Postman workspaces have User Groups in which team members are organized into functional groups to mimic your organizational structure. This makes it easy to manage and assign roles for all the members of a group.

  • As with an IDP (Identity Provider) User Group, Admins and Super Admins can create, manage, and delete groups. Developers can also create, manage, and delete developer-only groups.

Organizing groups to represent the organizational structure of your team is beneficial because it gives every user of your Postman instance the ability to gain context into the departments, teams, and user relationships while editing and collaborating on APIs.

Groups are all about streamlining management over many users and workspaces. If you’re on a team of developers that administer and contribute to a dozen workspaces within Postman, you should not have to add members to every workspace manually.

Roles within a Postman workspace

  • Team-level roles: Super Admin, Admin, Billing, Community Manager, API Network Manager, Partner Manager, Developer and Partner.
  • Workspace-level roles: Admin, Editor and Viewer access across the whole workspace.
  • Other Postman resources: Editor and Viewer access are available across collections, APIs, environments, mock servers and monitors.

Tips when creating collections

  • If your collection represents an API functionality, the collection name should summarize the API functionality.
  • Naming your environment with known prefixes like “-dev” will make it easier for others to pick the right environment from the dropdown while testing the API.
  • Never store sensitive information such as keys or request params that represent production traffic. Examples include PII such as credit card numbers and addresses, IP addresses of your production servers, details of a new product your company will be launching, and any other customer data.

SSO and SCIM

  • Once you have SSO set up, you can take your team automation one step further by enabling SCIM. Many organizations may already have an Identity Access Manager set up, simplifying the process of giving the right privileges to the right employees on the right app. Postman can plug right into that provisioning workflow, using the Postman SCIM API, Okta, or Azure AD. Creating, activating, and deactivating a user can be done by team admins, making the process quick and less prone to human error.
  • With SCIM enabled, users won’t have the option to leave your team on their own, and they won’t be able to change their account email or password. Only Team Admins will have the right to remove team members.
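
To give a sense of what deactivating a user looks like on the wire, here is a minimal Rust sketch that builds the JSON body of a standard SCIM 2.0 PatchOp request (RFC 7644) setting a user’s active attribute. It is an illustration under assumptions: the function name scim_set_active_body is made up, the body follows the generic SCIM standard rather than anything Postman-specific, and the endpoint, user ID and authentication header are deliberately left out and should be taken from the SCIM API documentation of the service you provision.

```rust
/// Builds the JSON body of a SCIM 2.0 PatchOp request that sets a
/// user's `active` attribute. Hypothetical helper for illustration:
/// it only produces the body and sends nothing over the network.
fn scim_set_active_body(active: bool) -> String {
    let mut body = String::new();
    // Every SCIM PATCH body declares the PatchOp message schema.
    body.push_str("{\"schemas\":[\"urn:ietf:params:scim:api:messages:2.0:PatchOp\"],");
    // One "replace" operation that flips the `active` flag.
    body.push_str("\"Operations\":[{\"op\":\"replace\",\"value\":{\"active\":");
    body.push_str(if active { "true" } else { "false" });
    body.push_str("}}]}");
    body
}

fn main() {
    // Body an identity provider might send to deactivate a user.
    println!("{}", scim_set_active_body(false));
    // Body to reactivate the same user later.
    println!("{}", scim_set_active_body(true));
}
```

In practice the identity provider generates and sends these requests for you; the sketch is only meant to show that activation and deactivation are small, auditable state changes rather than manual edits.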

Domain capture

  • Once domain capture is toggled on and paired with SSO, you can consolidate all of the existing Postman users into one company-managed account that shares your domain. Additionally, you can automatically direct any new users onto the account with Just In Time provisioning on the SSO configuration.
  • Note that this feature is a little more involved than just a toggle or filling in some fields: since “captured” users will be essentially locked out of their original accounts, it is vital to have a well-communicated plan in place to help everyone prepare and migrate any needed data. Postman’s support and customer success teams are always available to advise and help make the process go as smoothly as possible.

Next steps

By taking advantage of these common approaches to content and user management within your Postman team, you can maintain your instance with a high bar of quality, while ensuring that users get the best experience possible. You may be wondering, what next steps should I take?

  • Start with a taxonomy review of your current naming conventions and standards. This can include a subset of the team that can dig into the domain and business to figure out what language will make your team productive and give context.
  • Enable users to sign in through an SSO provider if you have one. Additionally, map users from your IDP in groups via SCIM provisioning.
  • Make a plan to claim “unclaimed” users that are a part of your domain, but do not exist on your Postman team (they may have their own separate individual teams).
  • Start to create and map groups in Postman with your identity provider groups, and create user groups that align with the business outcomes.

Don’t forget to register here to attend POST/CON 24, Postman’s biggest API conference ever: April 30 to May 1, 2024 in San Francisco.

The post Enterprise best practices: successfully govern your API content and users appeared first on Postman Blog.

Mr. Maeda's Cozy AI Kitchen Desserts Corner - What is AI?

From: Microsoft Developer
Duration: 5:19

A special bonus mini-episode of the Cozy AI Kitchen this week. Time for some dessert!! After serving up all sorts of AI for the past few months, John Maeda sits down with his friends Eleanor Lewis and John Kennedy to chat about AI and what it means to them. Please join the discussion and let us know how you define AI in the comments below.

Learn more about AI: https://msft.it/6055YH3nd

Get started with your first Radius application

From: Microsoft Developer
Duration: 13:24

Join Aaron and Ryan to learn how to get up and running with your first Radius application. They'll cover best practices as well as some tips and tricks to help get your app up and running on Radius.

Chapters:
00:00 - Introduction
00:26 - What is Radius
01:20 - Radius Example
08:22 - Liveness
12:38 - Where to go Next

Resources:
Learn more at: https://radapp.io
Find the latest info about the open-source Radius project available at: https://github.com/radius-project

📌 Let's connect:
Aaron Crawfis | https://twitter.com/AaronCrawfis
Ryan Nowak | https://twitter.com/aVerySpicyBoi

📝Submit Your OSS Project for Open at Microsoft https://aka.ms/OpenAtMsCFP

Subscribe to the Open at Microsoft: https://aka.ms/OpenAtMicrosoft

Open at Microsoft Playlist: https://aka.ms/OpenAtMicrosoftPlaylist
📆 New episode every Tuesday!
