# OnPremize

> On-prem AI for enterprise codebases
> Detailed version: [llms-full.txt](https://onpremize.com/llms-full.txt)
> Last-Updated: 2026-06-05T21:49:17.884Z

OnPremize deploys AI code intelligence inside your network. It combines retrieval-augmented generation (RAG) over your codebase with LoRA fine-tuning so models learn your architecture, patterns, and conventions.

## Capabilities

- **Code-Aware RAG**: Hybrid dense + sparse search over indexed repositories. Returns contextually relevant code snippets for AI-assisted Q&A.
- **LoRA Fine-Tuning**: Train lightweight adapters on your codebase. Supports Qwen, StarCoder2, and other model families via a port-and-adapter architecture.
- **OpenAI-Compatible API**: Drop-in replacement for existing tooling. Supports both OpenAI and Anthropic API formats.
- **Deployment Modes**: On-premise bare metal, Kubernetes, VPC, and fully air-gapped environments. No data leaves your network.
- **Agent Workflows**: Multi-step tool-calling traces and architecture Q&A dataset generation for continuous model improvement.

## Deployment

Runs on Linux and Kubernetes. Requires a GPU for inference and training. Uses Qdrant for vector storage, BGE-M3 for embeddings, and supports 4-bit quantization for reduced GPU memory.

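
As a rough illustration of why 4-bit quantization reduces GPU memory, the sketch below estimates the memory needed for model weights alone at different bit widths. The 7-billion-parameter size is a hypothetical example, not an OnPremize figure, and the estimate ignores activations, KV cache, and quantization metadata, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical 7-billion-parameter model (assumption for illustration).
params = 7e9
fp16 = weight_memory_gib(params, 16)
int4 = weight_memory_gib(params, 4)

print(f"16-bit weights: {fp16:.1f} GiB")
print(f" 4-bit weights: {int4:.1f} GiB")
print(f"reduction: {fp16 / int4:.0f}x")
```

Under these assumptions, weight storage scales linearly with bit width, so moving from 16-bit to 4-bit weights cuts that portion of memory to a quarter.
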
## Primary Sources

- [Air-Gapped AI Code Assistant](https://onpremize.com/solutions/air-gapped-ai-code-assistant)
- [On-Prem RAG for Source Code](https://onpremize.com/solutions/on-prem-rag-for-source-code)
- [LoRA Fine-Tuning for Private Code](https://onpremize.com/platform/lora-fine-tuning-for-private-code)
- [On-Prem AI Governance](https://onpremize.com/security/on-prem-ai-governance)
- [Machine-Readable Entity Facts](https://onpremize.com/ai-entity.json)

## Pages

- [Home](https://onpremize.com)
- [Privacy Policy](https://onpremize.com/privacy)
- [Terms of Service](https://onpremize.com/terms)
- [Brand Kit](https://onpremize.com/brand)

## Contact

- General: ping@onpremize.com
- Sales: ping@onpremize.com
- Security: ping@onpremize.com