AI is everywhere, but speed, privacy, and reliability are critical. Users expect instant answers without compromise. On-device AI makes that possible: fast, private, and available even when the network isn’t, empowering apps to deliver seamless experiences.
Imagine an intelligent assistant that answers in seconds without sending your text to the cloud. This approach brings speed and data control to the places that need them most, while still letting you tap into cloud power when it makes sense.
Windows AI Foundry is a developer toolkit that makes it simple to run AI models directly on Windows devices. It uses ONNX Runtime under the hood and can leverage CPU, GPU (via DirectML), or NPU acceleration, without requiring you to manage those details.
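To make that hardware flexibility concrete, here is a minimal sketch of the kind of ONNX Runtime session setup the toolkit manages on your behalf. The model path is a placeholder, and the DirectML provider is only present in DirectML-enabled builds of ONNX Runtime (for example, the onnxruntime-directml package):

import onnxruntime as ort

# Prefer DirectML (GPU) when available, otherwise fall back to CPU.
# "model.onnx" is a placeholder path, not a model shipped with the toolkit.
available = ort.get_available_providers()
providers = [p for p in ("DmlExecutionProvider", "CPUExecutionProvider") if p in available]

session = ort.InferenceSession("model.onnx", providers=providers)
print("Active providers:", session.get_providers())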
The principle is straightforward: run the model where the data lives.
Foundry Local is the engine that powers this experience. Think of it as a local AI runtime - fast, private, and easy to integrate into an app.
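The code samples later in this post call a foundry_runtime helper that isn’t defined in the original, so here is a hypothetical sketch of what it might look like, assuming Foundry Local exposes an OpenAI-compatible endpoint on localhost. The port, paths, and model name are illustrative assumptions, not documented defaults; check your Foundry Local setup for the real values:

import requests

class FoundryRuntime:
    def __init__(self, base_url="http://localhost:5273/v1", model="phi-3.5-mini"):
        self.base_url = base_url  # assumed local endpoint, not a documented default
        self.model = model        # assumed locally cached model name

    def check_foundry_available(self):
        # Probe the local service; any response means the runtime is up.
        try:
            requests.get(f"{self.base_url}/models", timeout=1)
            return True
        except requests.RequestException:
            return False

    def run_inference(self, question, context):
        # Send a chat-completion request to the local endpoint.
        response = requests.post(
            f"{self.base_url}/chat/completions",
            json={
                "model": self.model,
                "messages": [{
                    "role": "user",
                    "content": f"Context: {context}\n\nQuestion: {question}",
                }],
            },
            timeout=30,
        )
        response.raise_for_status()
        return response.json()["choices"][0]["message"]["content"]

foundry_runtime = FoundryRuntime()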
This approach is especially useful in regulated industries, field‑work tools, and any app where users expect quick, on‑device responses.
On-device AI doesn’t replace the cloud; it complements it. Here’s how:
Design an app to keep data local by default and surface cloud options transparently with user consent and clear disclosures.
Windows AI Foundry supports hybrid workflows:
1. On-device only: tries Foundry Local first, falls back to ONNX
if foundry_runtime.check_foundry_available():
    # Use on-device Foundry Local models
    try:
        answer = foundry_runtime.run_inference(question, context)
        return answer, "Foundry Local (On-Device)"
    except Exception as e:
        logger.warning(f"Foundry failed: {e}, trying ONNX...")

if onnx_model.is_loaded():
    # Fall back to the local BERT ONNX model
    try:
        answer = onnx_model.get_answer(question, context)
        return answer, "BERT ONNX (On-Device)"
    except Exception as e:
        logger.warning(f"ONNX failed: {e}")

return "Error: No local AI available"

2. Hybrid approach: On-device first, cloud as last resort
def get_answer(question, context):
    """
    Priority order:
    1. Foundry Local (best: advanced + private)
    2. ONNX Runtime (good: fast + private)
    3. Cloud API (fallback: requires internet, less private;
       used only in the hybrid approach, based on the real-time scenario)
    """
    if foundry_runtime.check_foundry_available():
        # Use on-device Foundry Local models
        try:
            answer = foundry_runtime.run_inference(question, context)
            return answer, "Foundry Local (On-Device)"
        except Exception as e:
            logger.warning(f"Foundry failed: {e}, trying ONNX...")

    if onnx_model.is_loaded():
        # Fall back to the local BERT ONNX model
        try:
            answer = onnx_model.get_answer(question, context)
            return answer, "BERT ONNX (On-Device)"
        except Exception as e:
            logger.warning(f"ONNX failed: {e}, trying cloud...")

    # Last resort: cloud API (requires internet)
    if network_available():
        try:
            import requests
            response = requests.post(
                '{BASE_URL_AI_CHAT_COMPLETION}',
                headers={'Authorization': f'Bearer {API_KEY}'},
                json={
                    'model': '{MODEL_NAME}',
                    'messages': [{
                        'role': 'user',
                        'content': f'Context: {context}\n\nQuestion: {question}'
                    }]
                },
                timeout=10
            )
            answer = response.json()['choices'][0]['message']['content']
            return answer, "Cloud API (Online)"
        except Exception:
            return "Error: No AI runtime available", "Failed"
    else:
        return "Error: No internet and no local AI available", "Offline"
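Here is a quick usage sketch for the hybrid function above. The network_available() helper is not defined in the original post, so the connectivity probe below is an assumption for illustration:

import socket

def network_available(host="8.8.8.8", port=53, timeout=2):
    # Cheap reachability check: try a TCP connection to a public DNS server.
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        return True
    except OSError:
        return False

answer, source = get_answer(
    "What is the warranty period?",
    "The device carries a two-year limited warranty from the date of purchase.",
)
print(f"[{source}] {answer}")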
On-device AI isn’t just a trend - it’s a shift toward smarter, faster, and more secure applications. With Windows AI Foundry and Foundry Local, developers can deliver experiences that respect user data, reduce latency, and work even when connectivity fails. By combining local inference with optional cloud enhancements, you get the best of both worlds: instant performance and scalable intelligence.
Whether you’re creating document summarizers, offline assistants, or compliance-ready solutions, this approach ensures your apps stay responsive, reliable, and user-centric.
Hey everyone and welcome to today's episode of Developer Tea. It's been quite a while since I've had a guest on the show. Today, I'm joined by Bryan McCann, CTO at you.com. We have a wide-ranging conversation, exploring the philosophical origins of his career - from studying meaning and language to working in very early AI research. This episode is less advice-heavy and more focused on theory and discussion. I hope it's insightful for you and helpful as you crystallize your own philosophies on these subjects.
If you enjoyed this episode and would like me to discuss a question that you have on the show, drop it over at: developertea.com.
If you want to be a part of a supportive community of engineers (non-engineers welcome!) working to improve their lives and careers, join us on the Developer Tea Discord community by visiting https://developertea.com/discord today!
If you're enjoying the show and want to support the content, head over to iTunes and leave a review! It helps other developers discover the show and keeps us focused on what matters to you.
1130. This week, we look at words related to elections, and then I help you remember the difference between "home in" and "hone in" with a tip that includes a shocking historical tidbit about spiders.
🔗 Share your familect recording in a WhatsApp chat.
🔗 Watch my LinkedIn Learning writing courses.
🔗 Subscribe to the newsletter.
🔗 Take our advertising survey.
🔗 Get the edited transcript.
🔗 Get Grammar Girl books.
🔗 Join Grammarpalooza. Get ad-free and bonus episodes at Apple Podcasts or Subtext. Learn more about the difference.
| HOST: Mignon Fogarty
| VOICEMAIL: 833-214-GIRL (833-214-4475).
| Grammar Girl is part of the Quick and Dirty Tips podcast network.
| Theme music by Catherine Rannus.
| Grammar Girl Social Media: YouTube. TikTok. Facebook. Threads. Instagram. LinkedIn. Mastodon. Bluesky.
Andrea and Kedasha sit down with data whisperer Jeff Luszcz, one of the wizards behind GitHub’s annual Octoverse report, to unpack this year’s biggest shifts. They get into why TypeScript overtook Python on GitHub, how AI-assisted “vibe coding” and agentic workflows are reshaping everyday engineering, and what it means that more than one new developer joins GitHub every second. From 1.12B open source contributions and 518M merged PRs to COBOL’s unexpected comeback, global growth (hello India, Brazil and Indonesia), and “security by default” with CodeQL and Dependabot, this episode turns the numbers into next steps for your career and your open source projects.
Links mentioned in the episode:
https://github.com/jeffrey-luszcz
https://github.com/features/copilot
https://docs.github.com/code-security/dependabot
https://docs.github.com/code-security/secret-scanning/introduction/about-secret-scanning
https://www.typescriptlang.org
https://marketplace.visualstudio.com/items?itemName=GitHub.copilot