GSoC 2025

Intelligent NRP + Kubernetes Routing & Management System

An intelligent router that classifies user input as command or explanation and dispatches accordingly, integrating NRP LLMs with Kubernetes operations.

Official GSoC 2025 Project

Project ID: fPp1JXbl

👤 Contributor

Manish K Reddy

GSoC 2025 Participant

🏛️ Mentor Organization

UC OSPO

University of California Open Source Program Office

🚀 Program

Google Summer of Code 2025

Open Source Research Experience (OSRE)

📋 Official Project Description

Develop an intelligent router that classifies user input as command (kubectl operations) or explanation (documentation/guidance) and dispatches accordingly. The system integrates NRP LLMs with Kubernetes to provide both actionable operations and contextual answers, delivering a clean, extensible Python package suitable for real cluster use and future observability features.

Machine Learning, Kubernetes, Python, CLI Tools, Cloud Computing, DevOps, NLP
2025 Google Summer of Code
5 Core Components
100% Working System

Project Goals

The primary objective was to develop an intelligent router that classifies user input as command (kubectl ops) or explanation (docs/guidance) and dispatches accordingly, integrating NRP LLMs with Kubernetes to provide both actionable operations and contextual answers.

High-level Objectives

  • Build an intelligent router that classifies user input as command (kubectl ops) or explanation (docs/guidance) and dispatches accordingly
  • Integrate NRP LLMs with Kubernetes to provide both actionable operations and contextual answers
  • Deliver a clean, extensible Python package suitable for real cluster use and future observability features

What I Built

Intent Router

intelligent_router.py: LLM-aided + keyword fallback classification into COMMAND, EXPLANATION, or UNCLEAR.

AI/ML Routing
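As a rough illustration of the keyword fallback, here is a minimal sketch. The keyword sets and the `classify_by_keywords` name are assumptions made for this example, not the contents of intelligent_router.py; only the three labels come from the description above.

```python
from enum import Enum


class Intent(Enum):
    COMMAND = "command"
    EXPLANATION = "explanation"
    UNCLEAR = "unclear"


# Hypothetical keyword sets; the real router may use different ones.
COMMAND_VERBS = {"list", "get", "create", "delete", "describe", "logs", "exec", "scale"}
QUESTION_WORDS = {"how", "what", "why", "when", "which", "explain"}


def classify_by_keywords(text: str) -> Intent:
    """Classify input from its leading word and a trailing question mark, with no LLM call."""
    words = text.lower().split()
    if not words:
        return Intent.UNCLEAR
    first = words[0].strip("?!.,:")
    if first in QUESTION_WORDS or text.rstrip().endswith("?"):
        return Intent.EXPLANATION
    if first in COMMAND_VERBS:
        return Intent.COMMAND
    return Intent.UNCLEAR
```

Anything that matches neither set falls through to UNCLEAR, so an unrecognized request never triggers a cluster operation by accident.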

K8s Ops Module

systems/k8s_operations.py: CRUD for pods and deployments, list/describe, logs/exec; defaults to the gsoc namespace.

Kubernetes Operations
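A list operation with a default namespace might look like the sketch below. It assumes an object shaped like the official Python client's `CoreV1Api` (whose `list_namespaced_pod` method is real); the `list_pod_names` helper and the `FakeCoreApi` stand-in are hypothetical and exist only so the example runs without a cluster.

```python
from types import SimpleNamespace

DEFAULT_NAMESPACE = "gsoc"


def list_pod_names(core_api, namespace: str = DEFAULT_NAMESPACE) -> list[str]:
    """Return pod names in a namespace using a CoreV1Api-like object."""
    response = core_api.list_namespaced_pod(namespace=namespace)
    return [pod.metadata.name for pod in response.items]


class FakeCoreApi:
    """Stand-in for kubernetes.client.CoreV1Api so the sketch runs offline."""

    def list_namespaced_pod(self, namespace):
        names = {"gsoc": ["web-abcdef-12345"]}.get(namespace, [])
        items = [SimpleNamespace(metadata=SimpleNamespace(name=n)) for n in names]
        return SimpleNamespace(items=items)


print(list_pod_names(FakeCoreApi()))
```

Against a real cluster, the same function would receive `kubernetes.client.CoreV1Api()` after the client configuration has been loaded.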

Interactive Shell & CLI

cli.py: one-shot or chat-style interactive usage.

CLI User-Friendly
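The one-shot versus chat-style split can be sketched with `argparse`. The `run_cli` function, the `nrp-k8s` program name, the `nrp> ` prompt, and the `exit`/`quit` keywords are assumptions for illustration; `handle` stands for whatever function routes a single request.

```python
import argparse


def run_cli(argv, handle, read_line=input, write=print):
    """Run one-shot when a query is given on the command line, otherwise start a REPL."""
    parser = argparse.ArgumentParser(prog="nrp-k8s")
    parser.add_argument("query", nargs="*", help="request to route; omit for interactive mode")
    args = parser.parse_args(argv)

    if args.query:
        write(handle(" ".join(args.query)))
        return 0

    while True:
        try:
            line = read_line("nrp> ").strip()
        except EOFError:
            break  # Ctrl-D ends the session
        if line in {"exit", "quit"}:
            break
        if line:
            write(handle(line))
    return 0
```

Passing `read_line` and `write` as parameters keeps the loop testable without a terminal.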

Modular Package Structure

Environment templating (config/default.env), clean separation of core/system logic, and cache isolation.

Architecture Modularity
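To show how an env template could feed the package, here is a dependency-free sketch. The variable names come from the Quick Start section; the `parse_env_text` and `load_settings` helpers, and the rule that real environment variables override file values, are assumptions rather than the package's documented behavior.

```python
import os


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env_text: str, environ=os.environ) -> dict[str, str]:
    """Treat file values as defaults; variables already set in the environment win."""
    settings = parse_env_text(env_text)
    for key in ("NRP_API_KEY", "NRP_BASE_URL", "NRP_MODEL"):
        if key in environ:
            settings[key] = environ[key]
    return settings
```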

LLM Integration

core/nrp_init.py: NRP API setup, model selection (e.g., gemma3) with graceful error handling + fallbacks.

LLM Integration
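One way to picture model selection with fallbacks is to probe candidates in order and keep the first that responds. This is a sketch under assumptions: `first_working_model` is a hypothetical helper, and `probe` stands for any callable that raises when a model cannot be reached. No real NRP API call is shown.

```python
from typing import Callable, Iterable, Optional


def first_working_model(candidates: Iterable[str], probe: Callable[[str], None]) -> Optional[str]:
    """Return the first model whose probe call succeeds, or None if all fail."""
    for model in candidates:
        try:
            probe(model)
            return model
        except Exception as exc:  # e.g. network error, auth failure, unknown model
            print(f"model {model!r} unavailable: {exc}")
    return None
```

A `None` result signals the caller to continue in keyword-only mode instead of aborting.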

Architecture & Components

nrp_k8s_system/
├── intelligent_router.py      # Routing: classify → dispatch (command/explanation)
├── cli.py                     # CLI entry points (single-shot & REPL)
├── core/
│   └── nrp_init.py            # NRP LLM init / model config
├── systems/
│   ├── k8s_operations.py      # K8s CRUD ops, logs, exec, list/describe
│   └── qain.py                # (optional) doc-answering scaffolding
├── config/
│   └── default.env            # Env template (NRP creds, model, base URL)
├── requirements.txt
└── pyproject.toml

Data Flow

User input → intelligent_router (LLM + heuristics) →
COMMAND → k8s_operations (Python K8s client)
EXPLANATION → NRP LLM answers with contextual guidance
UNCLEAR → safe defaults & clarification prompts
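The flow above can be sketched as a small dispatch function. The `route` name, the string labels as dictionary keys, and the stub handlers are assumptions for illustration; the real router's interfaces may differ.

```python
def route(text, llm_classify, keyword_classify, handlers):
    """Classify with the LLM, fall back to keywords on failure, then dispatch."""
    try:
        label = llm_classify(text)
    except Exception:
        label = keyword_classify(text)  # LLM slow or unreachable
    # Unknown labels are treated as UNCLEAR so nothing runs by accident.
    handler = handlers.get(label, handlers["UNCLEAR"])
    return handler(text)


# Stub handlers standing in for the K8s module and the LLM answerer.
HANDLERS = {
    "COMMAND": lambda text: f"[k8s] {text}",
    "EXPLANATION": lambda text: f"[llm] {text}",
    "UNCLEAR": lambda text: "Could you rephrase that as a command or a question?",
}
```

Because the classifiers and handlers are passed in, the dispatch logic can be exercised without an LLM or a cluster.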

Project Timeline

Planning

Architecture design and requirements gathering

Core Development

Router and K8s operations implementation

Integration

NRP LLM integration and testing

Deployment

Testing on Nautilus cluster

Current State

Status: Shipped. A working router and K8s ops package, tested on the Nautilus cluster (gsoc namespace).

✅ Working Features

  • Robustness: if the LLM is unavailable, the fallback classifier keeps the command path operational
  • Docs & Examples: README includes usage patterns, examples, and troubleshooting
  • Intelligent router with LLM + keyword fallback
  • Complete K8s operations (CRUD, logs, exec)
  • Interactive CLI and one-shot commands
  • NRP LLM integration with graceful fallbacks

What's Left / Next Steps

Resource Expansion

Broaden resource coverage: Jobs, StatefulSets, ConfigMaps, Services (advanced), PVCs

Observability

Observability integrations: Prometheus queries, DCGM GPU telemetry, alerting hooks

Safety Features

Uncertainty handling: add Conformal Prediction / CROQ loops for safer tool-calling

CI/CD

Unit and integration tests, plus GitLab pipelines

Demo Scenarios

Command Path Examples

# List resources
python -m nrp_k8s_system.intelligent_router "list my pods"

# Create deployment
python -m nrp_k8s_system.intelligent_router "create deployment web image=nginx replicas=3"

# Inspect logs
python -m nrp_k8s_system.intelligent_router "logs web-abcdef-12345"
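The report does not show how a request such as the create example becomes structured parameters. One plausible approach, sketched here purely as an assumption, is a small parser for `key=value` tokens; `parse_create_deployment` and its default of one replica are hypothetical.

```python
def parse_create_deployment(text: str) -> dict:
    """Parse 'create deployment <name> key=value ...' into a dict of parameters."""
    tokens = text.split()
    if len(tokens) < 3 or tokens[:2] != ["create", "deployment"]:
        raise ValueError("expected: create deployment <name> [key=value ...]")
    spec = {"name": tokens[2], "image": None, "replicas": 1}
    for token in tokens[3:]:
        key, sep, value = token.partition("=")
        if not sep or key not in ("image", "replicas"):
            raise ValueError(f"unrecognized option: {token!r}")
        spec[key] = int(value) if key == "replicas" else value
    if spec["image"] is None:
        raise ValueError("image=<image> is required")
    return spec
```

Rejecting unknown options outright is a deliberate choice in this sketch: a typo fails loudly instead of silently creating a deployment with defaults.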

Explanation Path Examples

python -m nrp_k8s_system.intelligent_router "How do I request A100 GPUs?"
python -m nrp_k8s_system.intelligent_router "What are storage best practices on Nautilus?"

✅ Works out of the box once .env is filled in and a kube context is configured.

Challenges & Learnings

Multi-tenant K8s Safety

Navigating RBAC/service accounts & default namespaces cleanly required careful design of permissions and namespace isolation.
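A client-side namespace guard is one way such isolation could be expressed in code. This sketch is an assumption, not the package's implementation: `resolve_namespace` and the single-entry allowlist are hypothetical, and cluster-side RBAC remains the real enforcement layer.

```python
ALLOWED_NAMESPACES = frozenset({"gsoc"})


def resolve_namespace(requested=None, default="gsoc", allowed=ALLOWED_NAMESPACES):
    """Return the namespace to operate in, refusing anything outside the allowlist."""
    namespace = requested or default
    if namespace not in allowed:
        raise PermissionError(f"namespace {namespace!r} is not permitted for this tool")
    return namespace
```

Checking before the API call gives the user a clear error message instead of a raw 403 from the cluster.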

LLM + Ops Integration

Designing a router that stays useful even when the LLM is slow/unavailable required robust fallback mechanisms and graceful degradation.

DX & Extensibility

Keeping the package simple to install, configure, and extend while maintaining powerful functionality required careful architectural decisions.

Frequently Asked Questions

How does the router classify input?

The intelligent router combines LLM classification with a keyword fallback to decide whether user input is a command (requiring a K8s operation) or an explanation request (requiring documentation or guidance).

What happens when the LLM is unavailable?

A keyword-based fallback classifier keeps command operations working even when the LLM service is unreachable.

Which Kubernetes resources are supported?

Pods and deployments are currently supported, with CRUD operations, logs, and exec. Planned additions are Jobs, StatefulSets, ConfigMaps, Services, and PVCs.

How to Use / Reproduce

Prerequisites

Python 3.10+, kubectl, NRP API access

Quick Start

# 1) Install
pip install -r requirements.txt
# or
pip install -e .

# 2) Configure
cp config/default.env .env
# then edit .env:
# NRP_API_KEY=YOUR_KEY
# NRP_BASE_URL=https://llm.nrp-nautilus.io/
# NRP_MODEL=gemma3

# 3) Run one-shot commands
python -m nrp_k8s_system.intelligent_router "list my pods"
python -m nrp_k8s_system.intelligent_router "create deployment web image=nginx replicas=3"
python -m nrp_k8s_system.intelligent_router "How do I request GPUs?"

# 4) Interactive mode
python -m nrp_k8s_system.intelligent_router

✅ Defaults to the gsoc namespace. Works with local kubeconfig or in-cluster config.
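Supporting both a local kubeconfig and in-cluster config usually means trying one loader and falling back to the other. The sketch below assumes the official `kubernetes` Python client, whose `config.load_kube_config` and `config.load_incluster_config` functions are real; the `load_first_available` and `load_kubernetes_config` helpers and the order of attempts are assumptions.

```python
def load_first_available(loaders):
    """Try each (name, loader) pair in order; return the name of the first that succeeds."""
    errors = []
    for name, loader in loaders:
        try:
            loader()
            return name
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no Kubernetes configuration found: " + "; ".join(errors))


def load_kubernetes_config():
    """Prefer the local kubeconfig, then fall back to in-cluster credentials."""
    # Imported lazily so the helper above can be used without the client installed.
    from kubernetes import config

    return load_first_available([
        ("kubeconfig", config.load_kube_config),
        ("in-cluster", config.load_incluster_config),
    ])
```

Collecting each loader's error message makes the final failure easier to diagnose than a bare "config not found".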

Mentor Organization

UC OSPO - University of California Open Source Program Office

Bolstering academic research through open source

The UC OSPO Network is a groundbreaking initiative that harnesses the collective power of six UC campuses to revolutionize open-source practices in academia. UC Santa Cruz OSPO serves as a mentor organization in Google Summer of Code 2025, supporting students through the Open Source Research Experience (OSRE) program.

🎓 Project Context & Mentorship

This project was developed as part of the National Research Platform (NRP), a distributed cyberinfrastructure for scientific computing. Work was conducted under the guidance of mentors from the San Diego Supercomputer Center (SDSC), with Mohammad Firas Sada providing direct technical mentorship on distributed systems architecture and Kubernetes integration.

🎯 Mission

Institutionalize open source practices across the University of California system while providing students hands-on experience with expert mentors.

📊 Impact

In summer 2024, OSRE supported 40 students working on open source and reproducibility projects through GSoC and the NSF FAIROS RCN program.

🌐 Network

Collaboration across six UC campuses (Santa Cruz, Berkeley, Davis, Los Angeles, Santa Barbara, and San Diego), supported by the Alfred P. Sloan Foundation.

Open Source Research, OSRE Program, Reproducibility, Student Mentorship, Multi-Campus Network

Research Infrastructure

National Research Platform (NRP)

A distributed cyberinfrastructure supporting scientific computing across research institutions. This project contributes to NRP's mission of providing seamless access to computational resources through intelligent routing and management systems.

Cyberinfrastructure, Scientific Computing, Distributed Systems

San Diego Supercomputer Center (SDSC)

Leading computational research facility providing advanced cyberinfrastructure and expert mentorship. SDSC researchers guided this project's development, ensuring alignment with production-scale research computing needs.

High Performance Computing, Research Computing, Mentorship

🔬 Research Impact

This intelligent routing system addresses real challenges in research computing environments, where users need both operational control and educational guidance when working with complex Kubernetes-based scientific workflows on the NRP infrastructure.

Acknowledgments

🎓 Research Mentorship

Mohammad Firas Sada (SDSC) - For exceptional technical mentorship and guidance on distributed systems architecture throughout the project

🌐 Platform Infrastructure

National Research Platform - For providing the distributed cyberinfrastructure context and real-world research computing environment

🤝 Community

Nautilus Community - Community members who helped validate RBAC configurations and cluster access patterns

🌟 Program

Google Summer of Code 2025 - For this incredible opportunity to contribute to open source research