We'll focus on customizing foundation large language models (LLMs) for languages other than English. We'll go through techniques like prompt engineering, prompt tuning, parameter-efficient fine-tuning (PEFT), and supervised instruction fine-tuning (SFT), which enable LLMs to adapt to diverse use cases. We'll showcase some of these techniques using NVIDIA NeMo Framework for both NVIDIA Foundation Models and community models such as Llama 2. Finally, we'll demonstrate how to efficiently deploy the customized models using NVIDIA TensorRT-LLM and NVIDIA Triton Inference Server.
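
To give a concrete flavor of one of these techniques, here is a minimal sketch of parameter-efficient fine-tuning with LoRA. Note this uses the Hugging Face `transformers` and `peft` libraries for illustration only, not the NeMo Framework workflow the session itself covers; the model name, target modules, and hyperparameters are illustrative assumptions, not values from this section.

```python
# Minimal LoRA sketch: illustrates the PEFT idea, not the NeMo workflow.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Assumed model; Llama 2 checkpoints require access approval on Hugging Face.
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# LoRA injects small trainable low-rank matrices into selected layers, so
# fine-tuning updates only a tiny fraction of the model's parameters.
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the LoRA updates
    target_modules=["q_proj", "v_proj"],  # assumed: attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

The wrapped model can then be trained on target-language instruction data with a standard training loop or `transformers.Trainer`; because only the adapter weights are updated, the approach is far cheaper than full fine-tuning, which is what makes PEFT attractive for adapting large foundation models to new languages.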