Getting AI Running On-Prem
Most IT Operations teams are curious about AI, but many have not yet brought it into their monitoring workflows. The reasons are understandable:
- Security concerns
- Sensitive infrastructure data
- Uncertainty about where to begin
The good news is that you can start small. You can run an AI runtime and a model entirely inside your own environment, with no external dependencies and no need to connect it to SCOM on day one.
What you’ll have at the end
By the end of this first step, you’ll have:
- A local AI runtime installed
- A working AI model running on-prem
- The ability to ask basic IT Operations questions
Think of this as step one: get comfortable running AI locally before connecting it to operational systems.
Step 1: Install a local AI runtime
A simple place to start is Ollama, a lightweight runtime for running large language models locally. It works on Windows, Linux, and macOS, and is easy to test without a major infrastructure project.
Windows quick start:
- From an internet-connected system, download and install Ollama from https://ollama.com
- Open a command prompt
- Run your first model
Step 2: Pull and run a model
Once Ollama is installed, run:
ollama run gpt-oss:20b
This downloads the model, starts it locally, and opens an interactive prompt. At this point you have an AI model running on-prem: you can disconnect the system from the internet or move it to an air-gapped environment.
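Beyond the interactive prompt, Ollama also exposes a local REST API (by default on http://localhost:11434), which is useful once you want to script questions instead of typing them. A minimal sketch, assuming the gpt-oss:20b model pulled above and an Ollama service running with default settings:

```python
import json
import urllib.request

# Ollama's default local endpoint for single (non-chat) completions.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming generate request body for the Ollama REST API."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str, timeout: int = 300) -> str:
    """Send a prompt to the local model and return the full response text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]

# Example (requires Ollama running locally with the model pulled):
# print(ask("gpt-oss:20b", "What are common causes of sustained high CPU?"))
```

Because everything stays on localhost, this works the same on an air-gapped machine once the model has been downloaded.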
Step 3: Ask a few IT Operations questions
Try prompts like:
What are common causes of sustained high CPU on a Windows Server?
What data would you need to diagnose a high CPU alert in SCOM?
Explain the difference between alert noise and actionable alerts.
At this stage, the model does not know your environment and has no access to SCOM. The goal is simply to show that AI can run locally and respond to operational questions.
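If you want to put the same starter questions in front of the whole team consistently, the one-shot CLI form (`ollama run <model> "<prompt>"`) prints a single answer and exits, which makes it easy to script. A small sketch; the model name and prompts are the ones used above:

```python
import subprocess

# The starter IT Operations prompts from this step.
PROMPTS = [
    "What are common causes of sustained high CPU on a Windows Server?",
    "What data would you need to diagnose a high CPU alert in SCOM?",
    "Explain the difference between alert noise and actionable alerts.",
]

def run_prompt(model: str, prompt: str) -> str:
    """Run one prompt through the local model via the ollama CLI and
    return the printed answer."""
    result = subprocess.run(
        ["ollama", "run", model, prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# Example (requires Ollama installed and the model pulled):
# for p in PROMPTS:
#     print(p, "\n", run_prompt("gpt-oss:20b", p), "\n")
```

Keep the answers around; comparing them with how your team would actually triage the same questions is a quick, low-risk way to judge whether the model's general guidance is useful.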
What to expect
Right now, the model can provide general guidance only. It cannot inspect alerts, query the OperationsManager database, or correlate events until it is connected to tools and data sources. That comes later. First, confirm that your team can run AI safely inside your environment.
Basic hardware expectations
Minimum to experiment
- Modern CPU
- 16–32 GB RAM
Recommended for a better experience
- GPU with 8 GB+ VRAM, for faster responses and better support for larger models
You do not need to design a production AI platform on day one. Start small, validate the basics, and expand from there.
Common pitfalls to avoid
Do not start by testing too many models.
Do not expect production-ready answers immediately.
Do not connect AI to sensitive systems before you understand how the runtime behaves.
Start with one runtime, one model, and a few practical IT Operations prompts.