On-device LLM Deployment
Deploying and Applying Large Models on Edge Devices
Compressing and quantizing billion-parameter language models for deployment on edge devices. Inference runs natively and offline, with no cloud dependency, bringing large models to environments where privacy is non-negotiable and connectivity is unreliable.