Exposing AMD GPU to Kubernetes

Current test system: Kubernetes, single-node Rancher system.
You can refer to this link for a single-node Kubernetes installation: https://blog.alphabravo.io/posts/2021/single-node-rke2-pt1/
References:
    http://www.bytefold.com/sharing-gpu-in-kubernetes/
    https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
Install the Kubernetes plug-in for ROCm
Install the plug-in using a DaemonSet. Deploy the DaemonSet (the plug-in is installed on every AMD GPU node). It's done, just like that! Verify the DaemonSet, […]
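One way to check that the plug-in works is to see whether each node advertises GPUs as an allocatable resource. The sketch below is not from the original post. It assumes the AMD device plug-in registers the extended resource name `amd.com/gpu`, and that the input is the JSON printed by `kubectl get nodes -o json`. The helper name `amd_gpus_per_node` and the sample data are hypothetical.

```python
import json

# Assumption: the AMD device plug-in advertises GPUs under this resource name.
GPU_RESOURCE = "amd.com/gpu"


def amd_gpus_per_node(nodes: dict) -> dict:
    """Map node name -> number of allocatable AMD GPUs.

    `nodes` is the parsed output of `kubectl get nodes -o json`.
    Nodes that do not advertise the resource are reported as 0.
    """
    counts = {}
    for item in nodes.get("items", []):
        name = item["metadata"]["name"]
        allocatable = item.get("status", {}).get("allocatable", {})
        counts[name] = int(allocatable.get(GPU_RESOURCE, 0))
    return counts


# Hypothetical sample shaped like `kubectl get nodes -o json` output.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "gpu-node"},
   "status": {"allocatable": {"cpu": "16", "amd.com/gpu": "1"}}},
  {"metadata": {"name": "cpu-node"},
   "status": {"allocatable": {"cpu": "8"}}}
]}
""")
print(amd_gpus_per_node(sample))
```

On a real cluster, the sample could be replaced with `json.load(sys.stdin)` and the script fed from `kubectl get nodes -o json`. A count of 0 on a GPU node would suggest the DaemonSet pod is not running there.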

Running LangChain and an LLM Model on AMD ROCm

System requirements:
    ROCm installation: refer to https://rocm.docs.amd.com/en/latest/
    Docker runtime installation: refer to https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository
    Conda environment installation (inside the container): refer to https://vegastack.com/tutorials/how-to-install-anaconda-on-ubuntu-22-04/
Current test system
In this guide I use the following spec:
Using the Docker image for ROCm
Now that ROCm is installed on the host OS, […]

AMD GPU Benchmark without Docker

Prerequisites: Ubuntu 22.04, Python 3.10.6
Reference: https://rocm.docs.amd.com/en/latest/how_to/tensorflow_install/tensorflow_install.html#../../deploy/linux/install.md
Install Python pip and the ROCm packages:
    sudo apt install python3-pip
    sudo apt install rocm-libs rccl
Use a virtual environment:
    sudo pip3 install virtualenv
    virtualenv psi-venv
    source psi-venv/bin/activate
Install tensorflow, tensorflow-rocm and protobuf:
    pip install tensorflow==2.12
    pip install tensorflow-rocm== --upgrade
    pip install protobuf==3.20.3
Verify that Python can import TensorFlow:
    python3 -c 'import tensorflow' 2> /dev/null && […]
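Importing TensorFlow successfully does not yet show that it can reach the GPU. As a follow-up check, the minimal sketch below lists the GPUs TensorFlow reports through `tf.config.list_physical_devices`. The helper name `visible_gpus` is hypothetical, and the script is an illustration rather than the author's own verification step.

```python
import importlib.util


def visible_gpus() -> list:
    """Return the names of the GPUs TensorFlow can see (empty if none).

    Also returns an empty list when TensorFlow is not installed, so the
    check can run before the virtual environment is fully set up.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return []
    import tensorflow as tf  # imported lazily because it is slow to load

    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(f"TensorFlow sees {len(gpus)} GPU(s): {gpus}")
```

Run it inside the activated `psi-venv` environment. An empty list there would point to a problem with the ROCm libraries or the `tensorflow-rocm` install rather than with Python itself.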

AMD GPU Benchmark

Ubuntu 22.04
Install Docker: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-22-04
Install the AMD TF-ROCm image (https://rocm.docs.amd.com/en/latest/how_to/tensorflow_install/tensorflow_install.html#../../deploy/linux/install.md):
    docker pull rocm/tensorflow:latest
Run it and enter the Docker console:
    docker run -it --network=host --device=/dev/kfd --device=/dev/dri \
        --ipc=host --shm-size 16G --group-add video --cap-add=SYS_PTRACE \
        --security-opt seccomp=unconfined rocm/tensorflow:latest
Run the benchmark. Result:
    total images/sec: 477.72
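When comparing runs (for example, with and without Docker), it can help to pull the summary figure out of the benchmark log automatically. The sketch below assumes only that the log contains a line of the form `total images/sec: <number>`, as in the result above. The helper name `total_images_per_sec` is hypothetical.

```python
import re

# Matches the summary line printed at the end of the benchmark run.
_TOTAL_RE = re.compile(r"total images/sec:\s*([0-9]+(?:\.[0-9]+)?)")


def total_images_per_sec(log_text: str):
    """Return the last `total images/sec` value in the log, or None."""
    matches = _TOTAL_RE.findall(log_text)
    return float(matches[-1]) if matches else None


# The only log line taken from the post; a real log has more output around it.
log = "total images/sec: 477.72\n"
print(total_images_per_sec(log))
```

Taking the last match is a design choice: if a log happens to contain several summary lines, the final one is treated as the result of the most recent run.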