PulseProbe – Your CLI Companion for Model Visibility on SambaNova Cloud

:books: Context

Modern AI applications often leverage multiple large language models (LLMs) deployed via cloud APIs like SambaNova Cloud. For seamless development and troubleshooting, it’s essential to maintain clear visibility into model endpoint responsiveness and behavior.

That’s where a simple, continuous diagnostic tool becomes useful — helping developers quickly validate endpoint behavior and share actionable output when needed.

:compass: Why PulseProbe?
While full-featured monitoring platforms exist, they may be overkill for everyday development and debugging needs.
PulseProbe is intentionally lightweight, CLI-friendly, and easy to integrate, designed to help teams validate model interactions and provide quick diagnostics for SambaNova Cloud APIs.

:magnifying_glass_tilted_left: Problem Statement

Teams building with SambaNova Cloud often integrate multiple models across a range of workflows. At times, you may encounter:

  • :red_question_mark: Unclear response behavior during development
  • :test_tube: Inconsistent results in CLI or GUI workflows
  • :man_detective: Challenges replicating conditions during support requests

Without a quick way to test API behavior across models, it’s difficult to:

  • :bullseye: Pinpoint if an issue is model-specific
  • :magnifying_glass_tilted_left: Collect reproducible examples
  • :envelope_with_arrow: Provide actionable info to support teams

:bullseye: Goal

Create a lightweight, script-based diagnostic tool that:

  • Dynamically fetches the latest model list from SambaNova Cloud
  • Cycles through each model with a standardized test prompt
  • Logs success/failure in real-time for quick triage
  • Helps users capture and share API response behavior easily
  • Minimizes resource/credit usage during checks
  • Handles clean exits for easy CLI or scheduled use

:hammer_and_wrench: Solution: PulseProbe

pulseprobe.py is a Python utility for real-time, low-cost diagnostics of LLM APIs on SambaNova Cloud.

Key Features:

  • :locked_with_key: Accepts a SambaNova API key as a CLI argument (no hardcoding)
  • :counterclockwise_arrows_button: Fetches the current active model list via /v1/models
  • :prohibited: Skips test/guard models like Meta-Llama-Guard-*
  • :speech_balloon: Sends a minimal prompt (“Say hello!”) to each model
  • :receipt: Logs results clearly:
      • :white_check_mark: Successful 200 OK
      • :cross_mark: Failures with HTTP code/message
      • :warning: Graceful handling of exceptions or timeouts
  • :repeat_button: Includes a 1-second delay between checks to limit credit use
  • :raised_hand: Supports clean exit on Ctrl+C

:test_tube: How It Works

:counterclockwise_arrows_button: Loop Steps:

  1. Initialize:
      • Load API key from CLI.
      • Set up HTTP headers.
  2. Fetch Models:
      • GET request to https://api.sambanova.ai/v1/models.
      • Filter out any models with "Guard" in their ID.
  3. Send Request:
      • POST to /v1/chat/completions for each model with the prompt: “Say hello!”
  4. Log Result:
      • :white_check_mark: If HTTP 200, print success.
      • :cross_mark: If failure, log error code and reason.
      • :warning: If exception (e.g., timeout), print warning.
  5. Repeat:
      • After all models are tested, print a cycle summary and repeat the cycle.
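
Steps 2 and 4 can be sketched as two small pure functions, kept separate from any network call so they are easy to try out. This is an illustrative sketch, not the script's actual code: the names `filter_models` and `format_result` are hypothetical, the payload shape (`{"data": [{"id": ...}]}`) is assumed from how the script reads the `/v1/models` response, and the sample payload is made up for the demo.

```python
from typing import Optional


def filter_models(payload: dict) -> list:
    """Return model IDs from a /v1/models-style payload, skipping 'Guard' models."""
    return [
        m["id"]
        for m in payload.get("data", [])
        if "id" in m and "Guard" not in m["id"]
    ]


def format_result(model: str, status: Optional[int] = None,
                  reason: str = "", error: Optional[Exception] = None) -> str:
    """Map the outcome of one check to a single log line."""
    if error is not None:
        # Exception path (e.g. a timeout): report only the exception type
        return f"⚠️ {model} failed: {type(error).__name__}"
    if status == 200:
        return f"✅ {model} responded successfully."
    return f"❌ {model} error: {status} - {reason}"


# Made-up payload: one normal model, one guard model, one entry without an ID
sample = {"data": [{"id": "DeepSeek-R1"}, {"id": "Meta-Llama-Guard-3-8B"}, {"object": "model"}]}
print(filter_models(sample))
print(format_result("QwQ-32B", status=503, reason="Service Unavailable"))
print(format_result("DeepSeek-R1", error=TimeoutError()))
```

Keeping the formatting in one place means the three log shapes stay consistent no matter which code path produced the result.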

:gear: Prerequisites

  • Python 3.7 or higher
  • requests package

pip install requests

  • Valid SambaNova Cloud API key

:open_file_folder: Code Overview

Filename: pulseprobe.py

Core Modules:

  • requests: For API interactions
  • signal: For handling clean exits
  • time: For delays between requests
  • sys: For reading the API key from the command line and setting exit codes

Highlights:

  • Runtime model discovery
  • Simple filtering and test loop
  • Human-readable, real-time logging
  • Useful in CI/CD, dev environments, and issue reporting
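
For CI/CD use, an endless loop is awkward because the job never finishes. A single-pass variant that reports an exit code may fit better. The sketch below assumes a hypothetical `run_once` helper and a caller-supplied `check` function that returns True on success; neither is part of pulseprobe.py as published, and the stand-in check makes no network calls.

```python
from typing import Callable, Iterable


def run_once(models: Iterable[str], check: Callable[[str], bool]) -> int:
    """Check every model once; return 0 if all passed, 1 otherwise."""
    failed = [m for m in models if not check(m)]
    for m in failed:
        print(f"❌ {m} failed this pass.")
    return 1 if failed else 0


# Stand-in check for illustration only; a real one would POST to the API.
def fake_check(model: str) -> bool:
    return model != "QwQ-32B"


exit_code = run_once(["DeepSeek-R1", "QwQ-32B"], fake_check)
print(f"exit code: {exit_code}")
# In a CI job, the last step would be sys.exit(exit_code).
```

A non-zero exit code lets the pipeline mark the step as failed without parsing the log output.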

:open_book: Usage Example & Code:

Run the script by passing your SambaNova API key as a command-line argument:

python pulseprobe.py <YOUR_API_KEY>

Example:

python pulseprobe.py 34db0ebf-4962-4124-b209-5a3c33f06d4b

Sample Output:

`📥 Found 14 active models (excluding 'Guard').
✅ DeepSeek-R1 responded successfully.
✅ E5-Mistral-7B-Instruct responded successfully.
❌ QwQ-32B error: 503 - Service Unavailable
⚠️ Meta-Llama-3.3-70B-Instruct failed: ReadTimeout
🔄 Completed cycle #1. Continuing...`

Code:

pulseprobe.py

import time
import requests
import signal
import sys

# Constants
API_URL = "https://api.sambanova.ai/v1/chat/completions"
MODEL_LIST_URL = "https://api.sambanova.ai/v1/models"

# Graceful termination
def signal_handler(sig, frame):
    print("\n⛔ Terminated by user.")
    sys.exit(0)

signal.signal(signal.SIGINT, signal_handler)

# Check for API key
if len(sys.argv) != 2:
    print("❗ Usage: python pulseprobe.py <YOUR_API_KEY>")
    sys.exit(1)

API_KEY = sys.argv[1]

HEADERS = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

# Fetch latest model list from SambaNova
def fetch_models():
    try:
        # A timeout keeps the script from hanging if the endpoint stalls
        response = requests.get(MODEL_LIST_URL, headers=HEADERS, timeout=30)
        response.raise_for_status()
        data = response.json()
        # Skip models that include 'Guard' in their ID
        models = [m["id"] for m in data.get("data", []) if "id" in m and "Guard" not in m["id"]]
        print(f"📥 Found {len(models)} active models (excluding 'Guard').")
        return models
    except requests.RequestException as e:
        print(f"❌ Error fetching model list: {e}")
        sys.exit(1)

# Send prompt to model
def make_request(model):
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "Respond concisely."},
            # Minimal prompt, as described above, to keep each check cheap
            {"role": "user", "content": "Say hello!"}
        ],
        # Small completion cap limits credit use per probe
        "max_tokens": 16,
        "temperature": 0.5,
        "stream": False
    }

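    try:
        # Without a timeout, requests waits indefinitely and no timeout is ever reported
        res = requests.post(API_URL, headers=HEADERS, json=payload, timeout=30)
        if res.status_code == 200:
            print(f"✅ {model} responded successfully.")
        else:
            print(f"❌ {model} error: {res.status_code} - {res.text}")
    except requests.RequestException as e:
        # Covers timeouts, connection errors, and other transport failures
        print(f"⚠️ {model} failed: {type(e).__name__}")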

# Main logic loop
def monitor_models():
    models = fetch_models()
    if not models:
        print("⚠️ No models available. Check your credentials or the model API.")
        sys.exit(1)

    idx = 0
    cycle = 1

    while True:
        model = models[idx]
        make_request(model)

        idx = (idx + 1) % len(models)
        if idx == 0:
            print(f"\n🔄 Completed cycle #{cycle}. Continuing...\n")
            cycle += 1

        time.sleep(1)

if __name__ == "__main__":
    monitor_models()

:light_bulb: Use Cases

| Use Case | Description |
|---|---|
| Diagnostic Visibility | Quickly validate if models are responding as expected. |
| CLI/GUI Troubleshooting | Check response behavior when results are unclear. |
| Support Ticket Aid | Capture logs and share with SambaNova support to streamline triage. |
| Credit-Conscious Checks | Uses a minimal prompt and a 1-second delay to reduce credit cost. |
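
For the support-ticket use case, timestamps make shared logs easier to correlate with server-side events. One possible approach uses the standard `logging` module. The `make_logger` helper and the `pulseprobe.log` filename are hypothetical, and the in-memory buffer stands in for a real log file so the sketch runs without touching disk.

```python
import io
import logging


def make_logger(stream) -> logging.Logger:
    """Build a logger that prefixes each line with a timestamp."""
    logger = logging.getLogger("pulseprobe")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate lines if called more than once
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


# Stand-in for open("pulseprobe.log", "a", encoding="utf-8")
buffer = io.StringIO()
log = make_logger(buffer)
log.info("✅ DeepSeek-R1 responded successfully.")
print(buffer.getvalue(), end="")
```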

:credit_card: API Credits & Usage Notes

  • PulseProbe was designed to run efficiently and continuously without consuming excessive API credits.
  • It uses a lightweight prompt and delay between checks to conserve resources.
  • You can adjust the timing or number of models to further control usage.
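
One way to expose those knobs is through `argparse`. The sketch below is an assumption about how such options could look: the `--delay` and `--max-models` flags and the `parse_args` helper are hypothetical and do not exist in the published script, and the model names are placeholders.

```python
import argparse


def parse_args(argv=None):
    """Parse the API key plus two optional usage controls."""
    parser = argparse.ArgumentParser(description="PulseProbe options (illustrative sketch)")
    parser.add_argument("api_key", help="SambaNova Cloud API key")
    parser.add_argument("--delay", type=float, default=1.0,
                        help="seconds to wait between checks")
    parser.add_argument("--max-models", type=int, default=None,
                        help="check only the first N models")
    return parser.parse_args(argv)


# Explicit argument list for the demo; pass nothing to read sys.argv instead
args = parse_args(["<YOUR_API_KEY>", "--delay", "5", "--max-models", "3"])

# Slicing with None keeps the whole list, so the default means "no limit"
models = ["model-a", "model-b", "model-c", "model-d"][: args.max_models]
print(args.delay, models)
```

A longer delay or a smaller model subset both lower the number of requests per hour, which is the main driver of credit use for a continuous probe.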