Prerequisites
• Cursor installed
• LM Studio installed
• ngrok installed
• A local machine capable of running an LLM
For this guide, we’ll use the model: zai-org/glm-4.6v-flash
Step 1: Install LM Studio
Download and install LM Studio from the official website.
Once installed, launch the application.
https://lmstudio.ai/
Step 2: Download a Model
Inside LM Studio, download the model you want to use.
For this article, we use: zai-org/glm-4.6v-flash
Make sure the download completes successfully before moving on.
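Optionally, if you also have LM Studio's lms command-line tool set up, you can confirm the download from a terminal (an optional check; the exact command set and output depend on your LM Studio version):
lms ls
The downloaded model should appear in the list under the same identifier.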
Step 3: Install ngrok
Install ngrok on your system.
ngrok allows you to expose your local server to the internet with a public URL.
https://ngrok.com/
If you have Homebrew, you can install ngrok with a single command:
brew install ngrok
Step 4: Set Up ngrok
Complete the ngrok setup by following the instructions on the official ngrok Setup page.
This typically includes:
• Creating an ngrok account
• Authenticating your local installation with an auth token
ngrok config add-authtoken {your_token}
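To verify the token was saved, ngrok v3 can validate its configuration file (a simple sanity check; the output just reports whether the config is valid):
ngrok config check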
Step 5: Start the Local Server in LM Studio
1. Open LM Studio
2. Enable Developer Mode
3. Start the local server
LM Studio will now serve your local LLM through an OpenAI-compatible API, by default at http://localhost:1234.
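Before exposing anything, it's worth a quick local sanity check. Assuming the default port 1234, this request should return a JSON list that includes the model you downloaded:
curl http://localhost:1234/v1/models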
Step 6: Expose the Local Server with ngrok
Open a terminal and run the following command:
ngrok http 1234
Note: 1234 should match the port used by LM Studio’s local server.
After running the command, ngrok will display a public URL like:
https://yours.ngrok-free.app
Copy this URL — you’ll need it for Cursor.
You'll see terminal output similar to this:
🚪 One gateway for every AI model. Available in early access *now*: https://ngrok.com/r/ai
Session Status online
Account your_account (Plan: Free)
Version 3.35.0
Region United States (us)
Latency 19ms
Web Interface http://127.0.0.1:4040
Forwarding https://something.ngrok-free.app -> http://localhost:1234
Connections ttl opn rt1 rt5 p50 p90
7 0 0.00 0.00 6.26 263.91
HTTP Requests
-------------
20:10:37.113 EST POST /v1/chat/completions 200 OK
20:06:13.115 EST POST /v1/chat/completions 200 OK
20:04:59.112 EST POST /v1/chat/completions 200 OK
20:04:42.221 EST POST /v1/chat/completions 200 OK
20:03:11.002 EST POST /v1/chat/completions 200 OK
20:03:05.636 EST POST /v1/chat/completions 200 OK
20:02:22.796 EST POST /v1/chat/completions 200 OK
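To confirm the tunnel works end to end, you can hit the same endpoint through the public URL (replace it with the URL ngrok printed for you):
curl https://yours.ngrok-free.app/v1/models
If this returns the same model list as the local check in Step 5, Cursor will be able to reach your server. If you get ngrok's free-tier browser warning page instead of JSON, sending the ngrok-skip-browser-warning header with any value should bypass it.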
Step 7: Open Cursor Settings
Launch Cursor and navigate to:
Settings → Models / OpenAI Configuration
Step 8: Configure the OpenAI Base URL
1. Enable OpenAI API Key
2. Enter any placeholder value for the API key
• In this example, we simply use: 1234
3. Paste the ngrok URL into Override OpenAI Base URL
4. Append /v1 to the end of the URL
Your final URL should look like this:
https://yours.ngrok-free.app/v1
Step 9: Add a Custom Model
1. Click Add Custom Model
2. Enter a name for your local LLM, for example: GLM4.6-local
⚠️ Windows users:
You must enter the exact model name that LM Studio reports internally.
For this case, zai-org/glm-4.6v-flash
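If you're unsure which identifier LM Studio reports, the /v1/models check from Step 5 lists it exactly. You can also verify the full path Cursor will use with a test request through the tunnel (the URL and model name below match the placeholders used earlier; adjust them to your own):
curl https://yours.ngrok-free.app/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "zai-org/glm-4.6v-flash", "messages": [{"role": "user", "content": "Say hello"}]}'
A JSON response containing a generated message means the whole chain (ngrok, LM Studio, and the model) is working.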
Done! 🎉
That’s it — the setup is complete.
You can now open Cursor Chat, enter a prompt, and send it. Cursor will route the request through ngrok to your local LLM running in LM Studio.
This setup allows you to enjoy Cursor’s powerful coding experience while keeping inference fully local.
Final Thoughts
Using Cursor with a local LLM is a great way to:
• Reduce API costs
• Improve privacy
• Experiment with custom or open-source models
LM Studio and ngrok make the process surprisingly straightforward. Once configured, it feels almost identical to using a hosted OpenAI model — except everything runs on your own machine.
Happy hacking! 🚀











