Getting AI Running On-Prem
Most IT Operations teams are curious about AI, but many have not yet brought it into their monitoring workflows. The reasons are understandable:
- Security concerns
- Sensitive infrastructure data
- Uncertainty about where to begin
The good news is that you can start small. You can run an AI runtime and a model entirely inside your own environment, with no external dependencies and no need to connect it to SCOM on day one.
What you’ll have at the end
By the end of this first step, you’ll have:
- A local AI runtime installed
- A working AI model running on-prem
- The ability to ask basic IT Operations questions
Think of this as step one: get comfortable running AI locally before connecting it to operational systems.
Step 1: Install a local AI runtime
A simple place to start is Ollama, a lightweight runtime for running large language models locally. It works on Windows, Linux, and macOS, and is easy to test without a major infrastructure project.
Windows quick start:
- From an internet-connected system, download and install Ollama from https://ollama.com
- Open a command prompt
- Run your first model
Step 2: Pull and run a model
Once Ollama is installed, run:
ollama run gpt-oss:20b
This downloads the model, starts it locally, and opens an interactive prompt. At this point you have an AI model running on-prem: you can disconnect the system from the internet or move it to an air-gapped environment.
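Beyond the interactive prompt, Ollama also exposes a local REST API (by default on http://localhost:11434), which is useful once you want to script questions instead of typing them. A minimal sketch, assuming the gpt-oss:20b model pulled above and an Ollama service running with default settings:

```python
import json
import urllib.request

# Ollama's default local endpoint for single (non-chat) completions.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming generate request body for the Ollama REST API."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str, timeout: int = 300) -> str:
    """Send a prompt to the local model and return the full response text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]

# Example (requires Ollama running locally with the model pulled):
# print(ask("gpt-oss:20b", "What are common causes of sustained high CPU?"))
```

Because everything stays on localhost, this works the same on an air-gapped machine once the model has been downloaded.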
Step 3: Ask a few IT Operations questions
Try prompts like:
What are common causes of sustained high CPU on a Windows Server?
What data would you need to diagnose a high CPU alert in SCOM?
Explain the difference between alert noise and actionable alerts.
At this stage, the model does not know your environment and has no access to SCOM. The goal is simply to show that AI can run locally and respond to operational questions.
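If you want to put the same starter questions in front of the whole team consistently, the one-shot CLI form (`ollama run <model> "<prompt>"`) prints a single answer and exits, which makes it easy to script. A small sketch; the model name and prompts are the ones used above:

```python
import subprocess

# The starter IT Operations prompts from this step.
PROMPTS = [
    "What are common causes of sustained high CPU on a Windows Server?",
    "What data would you need to diagnose a high CPU alert in SCOM?",
    "Explain the difference between alert noise and actionable alerts.",
]

def run_prompt(model: str, prompt: str) -> str:
    """Run one prompt through the local model via the ollama CLI and
    return the printed answer."""
    result = subprocess.run(
        ["ollama", "run", model, prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# Example (requires Ollama installed and the model pulled):
# for p in PROMPTS:
#     print(p, "\n", run_prompt("gpt-oss:20b", p), "\n")
```

Keep the answers around; comparing them with how your team would actually triage the same questions is a quick, low-risk way to judge whether the model's general guidance is useful.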
What to expect
Right now, the model can provide general guidance only. It cannot inspect alerts, query the OperationsManager database, or correlate events until it is connected to tools and data sources. That comes later. First, confirm that your team can run AI safely inside your environment.
Basic hardware expectations
Minimum to experiment
- Modern CPU
- 16–32 GB RAM
Recommended for a better experience
- GPU with 8 GB+ VRAM, for faster responses and better support for larger models
You do not need to design a production AI platform on day one. Start small, validate the basics, and expand from there.
Common pitfalls to avoid
Do not start by testing too many models.
Do not expect production-ready answers immediately.
Do not connect AI to sensitive systems before you understand how the runtime behaves.
Start with one runtime, one model, and a few practical IT Operations prompts.