Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Perplexity launches Sonar, an API for AI search

Perplexity on Tuesday launched an API service called Sonar, allowing enterprises and developers to build the startup’s generative AI search tools into their own applications. “While most generative AI features today have answers informed only by training data, this limits their capabilities,” Perplexity wrote in a blog post. “To optimize for factuality and authority, APIs […]

© 2024 TechCrunch. All rights reserved. For personal use only.

alvinashcraft
4 hours ago
Pennsylvania, USA

Microsoft’s Steam-like browser overlay is now available on Windows 11

The Microsoft Edge web browser logo against a swirling blue background.
Image: The Verge

Microsoft is rolling out its new in-game browser overlay on Windows 11 this week, after months of beta testing. The Microsoft Edge Game Assist feature is a widget that appears in the Game Bar in Windows 11 much like Valve’s Steam overlay browser. It’s also game-aware, so it can detect games you’re playing and offer up tips and guides in a little side panel.

The Game Assist overlay was previously restricted to beta users, but it’s now available in the stable version of Microsoft Edge. If you want to enable the in-game browser you can open up Microsoft Edge and go to Settings and more > Settings and then search for Game Assist and install the widget. The Game Assist feature will then be available in the Game Bar, which can be opened with the Windows key + G.

“The initial preview of Game Assist offers contextual tips and guides for a selection of popular PC games while we optimize the experience based on your feedback,” explains William Devereux, senior product manager for Microsoft Edge. Microsoft has also added support for more popular PC games, including Indiana Jones and the Great Circle, Marvel Rivals, and Dragon Age: The Veilguard.

“We’ll add tips and guides for even more popular games throughout the preview and over time,” says Devereux. “In the meantime, you can still use Game Assist to browse your favorite guides or other websites while playing any game.”

Game Assist works by using the same cookies, autofill, and favorites data from your main Microsoft Edge browser. Microsoft has also added support for extensions like ad blockers to this Game Assist feature, and it’s planning to add support for keyboard shortcuts in the future, alongside an improved picture-in-picture experience and the ability to add a tab from Microsoft Edge to the sidebar.

Game Developers Are Getting Fed Up With Their Bosses' AI Initiatives

More than half of video game developers reported their companies are using generative AI in game development, according to an annual survey released Tuesday. The Game Developers Conference (GDC) report found that 52% of developers worked at companies using AI tools, while 30% felt negatively about the technology, up from 18% last year. Only 13% believed AI had a positive impact on games, down from 21% in 2024. One in 10 developers lost their jobs over the past year, with some reporting extended periods of unemployment. One developer cited in a Wired story said they submitted 500 job applications without success, while another reported being laid off three times in the last year. Covid-era over-expansion, unrealistic expectations, and poor management are being identified as key factors behind the industry's troubles.

Read more of this story at Slashdot.

Microsoft will automatically keep you signed in to your account starting in February

Vector illustration of the Microsoft logo.
Image: Cath Virginia / The Verge

Microsoft is making some changes to the way you sign in to a Microsoft account next month. Starting in February, you will stay signed in to a Microsoft account automatically unless you sign out or use private browsing. It’s a change that people will need to be aware of, especially if they’re using a public computer.

Right now, if you sign in to a Microsoft account you’ll always be asked if you want to stay signed in, so you don’t have to sign in again next time. Microsoft’s change to automatically keeping you signed in means you’ll have to use a private browsing window on public PCs or make sure you remember to sign out once your session ends, otherwise the account will remain signed in.

If you do regularly use public PCs with your Outlook or Microsoft account then it’s definitely time to start getting used to signing out or using a private browsing session (which you should really be doing anyway). If you mistakenly forget to sign out of a Microsoft account in February, you can always force your account to be signed out on all browsers, apps, and anywhere else it’s being used apart from Xbox consoles.

Microsoft’s latest change to its account sign-in process comes months after the company added passkey support to all of its consumer accounts. You can create passkeys for your Microsoft account and choose your face, fingerprint, PIN, or a security key to sign in on a device with a passkey.

Trump moves to sink offshore wind

President Trump signed an executive order that halted federal leases for offshore wind development on the outer continental shelf.

© 2024 TechCrunch. All rights reserved. For personal use only.

How To Build Cost-Efficient Cloud Architectures for GenAI Workloads

Generative AI dominates conversations from boardrooms to breakrooms, and the euphoria has led to overzealous investments, unchecked demands on underlying cloud infrastructure, localized use cases, and limited business value. An IBM report states that GenAI is a significant factor in rising computing costs, up by 89% from 2023 to 2025 and trending toward $76 billion by 2028. Despite this, 90% of CEOs are still waiting for GenAI to move past experimentation within their organizations.

Significant investments in cloud, teams, and solutions with GenAI as the focus have failed to yield the intended returns, compelling more than half of the enterprises with active GenAI investments to stall or abandon these projects in the next three years.

Here Are Three Main Reasons:

High Total Cost of Ownership for LLMs

Enterprises leveraging pre-trained large language models (LLMs) incur inference costs, priced separately for input (prompt) tokens and output (completion) tokens and varying with model capabilities, so the final cost depends on usage, complexity, and data volume. For instance, the GPT-4 model can offer contextually aware, detailed responses on broader topics. It is priced higher than the newer GPT-4o, which is optimized for speed but may offer more concise output.

GPT-4 also has robust memory and can keep track of more extended conversations, whereas GPT-4o may not. Regarding pricing, with a 128K context, a request with a 1,000-token prompt and a 1,000-token completion costs $0.01 for input tokens and $0.03 for completion on GPT-4, totaling $0.04. For the same request, GPT-4o costs $0.0025 for input and $0.01 for completion, totaling $0.0125, which makes it more than three times cheaper. Enterprises that do not weigh LLM capabilities against business needs will end up with a higher total cost of ownership.
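The arithmetic behind that comparison can be reproduced with a few lines of Python. This is a minimal sketch: the per-1,000-token prices are the figures quoted in the paragraph above, not live pricing, and the function name `request_cost` and the assumption of a 1,000-token completion are illustrative choices rather than anything defined by the article.

```python
# Per-1,000-token prices in USD, copied from the figures quoted above.
# They are the article's numbers, not live pricing; check the provider's price list.
PRICES_PER_1K = {
    "gpt-4": {"input": 0.01, "output": 0.03},
    "gpt-4o": {"input": 0.0025, "output": 0.01},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request for the given token counts."""
    price = PRICES_PER_1K[model]
    input_cost = (input_tokens / 1000) * price["input"]
    output_cost = (output_tokens / 1000) * price["output"]
    return input_cost + output_cost


# Assumed workload: a 1,000-token prompt and a 1,000-token completion.
gpt4_cost = request_cost("gpt-4", 1000, 1000)
gpt4o_cost = request_cost("gpt-4o", 1000, 1000)
ratio = gpt4_cost / gpt4o_cost

print(f"GPT-4:  ${gpt4_cost:.4f} per request")
print(f"GPT-4o: ${gpt4o_cost:.4f} per request")
print(f"GPT-4 costs {ratio:.1f}x as much for this request")
```

Multiplying the per-request figure by expected monthly request volume gives a first-order estimate of how the model choice affects total cost of ownership.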

Overprovisioned Resources

Enterprises that host LLMs in their own cloud environments for enhanced data privacy invest heavily in the locally hosted model and in setting up GenAI applications. This requires a scalable infrastructure that can handle a high volume of requests, store large amounts of data, and deliver high performance. Enterprises tend to provision these cloud resources around the clock to avoid model latency. While it is easier to anticipate the resource requirements for a GenAI proof of concept, most enterprises overestimate them when scaling up, leading to underutilized cloud sprawl that is harder to govern or monitor and results in overbilling. Cloud costs associated with GenAI are now twice as high as the cost of the underlying model.

Lack of Visibility into Cloud Pricing Models

Enterprises consider the cloud a cost-takeout measure yet struggle to understand how cloud resources are billed, which makes it challenging to visualize cloud expenses. Even if they opt for pay-per-use, which is among the highest-priced cloud subscription options, or choose the right instances, costs will trend upward because they do not fully understand the cost structures or lack an optimal utilization strategy for GenAI deployments.

Running GenAI in the Cloud With a Budget

While GenAI is here to stay, enterprises need a strategy that considers GenAI’s cost complexity and technical challenges while creating a roadmap for scalable, sustainable, and profitable deployment. Here is a three-pronged approach to whiteboarding this strategy, with an explicit focus on cost-benefit analysis and stringent FinOps principles:

  • Optimize Cloud Instances: Enterprises should start by understanding workload needs to assess the computational and memory requirements. Hyperscalers offer AI-optimized instances that help reduce the cost of running AI workloads in the cloud. Monitoring and adjusting instance sizes can help avoid overprovisioning and ensure significant cost savings. For instance, spot instances are a viable option if the workload is designed to be fault-tolerant, since they can be up to 90% cheaper than on-demand instances. Establishing clear scaling policies, with predictive models to anticipate demand spikes, can help align resources accurately with demand. Implementing budget cut-offs prevents scaling beyond a specified cost threshold.
  • Adopt FinOps at Scale: Fostering a cost-conscious culture, which includes regular reviews, audits, and optimization efforts, is vital to ensure ongoing efficiency and effectiveness. GenAI is best run in cloud environments with aggressive FinOps principles. A good start would be a commercial FinOps tool, or the one provided by the hyperscaler, combined with FinOps practices such as right-sizing, cost tracking and auditing, and repurposing available resources. Enhanced cost visibility will facilitate better decision-making, and allocating costs with accurate tagging will provide deeper insights into cost distribution.
  • Leverage AI for Cost Control: AI-optimized hardware, such as custom chips like Google Tensor Processing Units (TPUs) and AWS Inferentia, delivers computational power tailored for AI tasks without cost overruns. Model efficiency can guide the selection of appropriate function-specific LLMs, further optimizing performance and resource usage. Enterprises can adopt advanced compression techniques to reduce memory and computational needs during training and inference. Serverless AI platforms built on Function-as-a-Service (FaaS) let enterprises run AI workloads flexibly while the hyperscaler manages the underlying server infrastructure. To safeguard data, enterprises should adopt robust regulatory compliance and data governance measures that ensure data remains within local jurisdictions while protecting user privacy through anonymization and encryption.
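The scaling policy with a budget cut-off described in the first bullet can be sketched as follows. This is an illustration under stated assumptions, not a hyperscaler API: `ScalingPolicy`, `plan_instances`, the per-instance throughput, and the prices (kept in integer cents to avoid floating-point rounding) are all hypothetical names and values.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy; every field is an assumption chosen for illustration."""
    requests_per_instance: int       # sustained requests/sec one instance can serve
    hourly_cost_cents: int           # price of one instance per hour, in cents
    hourly_budget_cents: int         # spending ceiling per hour, in cents
    min_instances: int = 1           # availability floor, honored even over budget


def plan_instances(policy: ScalingPolicy, forecast_rps: float) -> tuple:
    """Return (instances_to_run, was_capped) for a forecast request rate."""
    # Instances needed to serve the forecast demand.
    wanted = max(policy.min_instances,
                 math.ceil(forecast_rps / policy.requests_per_instance))
    # Instances the hourly budget can pay for.
    affordable = policy.hourly_budget_cents // policy.hourly_cost_cents
    # Budget cut-off: never scale past what the budget allows,
    # but never drop below the availability floor.
    granted = max(policy.min_instances, min(wanted, affordable))
    return granted, granted < wanted


policy = ScalingPolicy(requests_per_instance=50,
                       hourly_cost_cents=250,
                       hourly_budget_cents=2000)

normal = plan_instances(policy, forecast_rps=320)   # demand fits within budget
spike = plan_instances(policy, forecast_rps=900)    # demand exceeds budget
print(normal, spike)
```

When the second value in the result is true, demand outran the budget; a real deployment would pair that signal with an alert or request queueing rather than silently dropping traffic.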

Platform-Based Approach to GenAI Cost Governance

Understanding cloud cost structures, the computational requirements of GenAI workloads, and the underlying model is key to cost efficiency. Not every workload requires an LLM that handles high cognitive loads, which means not every GenAI application will need the same resources or instances in the cloud or cost the same. Understanding these dynamics and factoring them into the GenAI strategy can help enterprises make informed decisions about budgets, investments, and operating costs.

One way could be a hybrid-cloud environment for GenAI deployments, where enterprises orchestrate a best-fit capability stack that meets their budgetary requirements. A dashboard that collates resource utilization, cost centers, and consumption patterns can make tracking GenAI cloud investments easier. This can help enterprises set cost thresholds or policy-based pricing for LLM deployments across hyperscalers. A cheaper LLM may be used for routine tasks, while tasks with a higher cognitive load are routed to a more expensive LLM, with all models assembled from different hyperscalers via a GenAI platform that uses quantization to create smaller models for low-cost virtual machines.
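A policy that sends routine work to a cheaper model and reserves the expensive one for demanding tasks, subject to a cost threshold, might look like the sketch below. The tier names, prices, the 1-to-10 complexity scale, and the `route` function are all hypothetical; they illustrate the idea and do not correspond to any particular platform or model catalog.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ModelTier:
    name: str                   # hypothetical model label
    cost_per_1k_tokens: float   # assumed blended price in USD
    max_complexity: int         # highest task complexity (1-10) this tier handles


# Illustrative catalog; names and prices are made up for this example.
TIERS = [
    ModelTier("small-quantized", 0.0005, 3),
    ModelTier("mid-general", 0.0025, 7),
    ModelTier("large-reasoning", 0.03, 10),
]


def route(complexity: int, est_tokens: int,
          budget_remaining: float) -> Optional[Tuple[str, float]]:
    """Pick the cheapest tier able to handle the task, or None if over budget."""
    for tier in sorted(TIERS, key=lambda t: t.cost_per_1k_tokens):
        if tier.max_complexity >= complexity:
            cost = est_tokens / 1000 * tier.cost_per_1k_tokens
            # Cost threshold: refuse the request rather than overspend.
            if cost > budget_remaining:
                return None
            return tier.name, cost
    return None  # no tier can handle this complexity


routine = route(complexity=2, est_tokens=2000, budget_remaining=1.0)
demanding = route(complexity=9, est_tokens=1000, budget_remaining=1.0)
over_budget = route(complexity=9, est_tokens=1000, budget_remaining=0.01)
print(routine, demanding, over_budget)
```

The routing decisions and their costs are also the kind of records a cost dashboard would aggregate by cost center.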

A cost-efficient GenAI deployment depends on granular visibility into cloud spending and the technical needs of an AI workload — all tied to the business value delivered on the ground.

The post How To Build Cost-Efficient Cloud Architectures for GenAI Workloads appeared first on The New Stack.
