This study aims to develop a real-time motion recognition system that translates human skeletal movements into a virtual environment, achieved through accurate capture of the human skeleton and coordinate conversion. The paper investigates the acquisition and processing of motion data for virtual characters, using depth cameras to obtain depth information. Six specific actions are targeted: left kick, right kick, left punch, right punch, squatting, and sitting. The experimental process successfully integrated an RGB-D camera, MediaPipe, and OpenCV with Unreal Engine models to capture and display human skeletal joint positions in real time. The experimental results show that the system achieved a precision of 100% for all motion detections, with an accuracy above 94%. However, the recall for certain actions was lower, at 88%.
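The coordinate conversion mentioned above can be illustrated with a minimal sketch: back-projecting a depth-camera pixel to a 3D camera-space point via the pinhole model, then remapping it into an Unreal-style coordinate frame. The intrinsic values (`fx`, `fy`, `cx`, `cy`) and the axis convention here are illustrative assumptions, not parameters from the paper.

```python
def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection: pixel (u, v) with depth in metres -> camera XYZ.

    Camera frame assumed: x right, y down, z forward (OpenCV convention).
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def camera_to_unreal(point):
    """Map camera coords (metres) to an Unreal-style frame.

    Unreal convention assumed: X forward, Y right, Z up, units in centimetres.
    """
    x, y, z = point
    return (z * 100.0, x * 100.0, -y * 100.0)


# Hypothetical joint at pixel (400, 300) with 2 m depth, made-up intrinsics.
joint_cam = pixel_to_camera(400, 300, 2.0, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
joint_ue = camera_to_unreal(joint_cam)
print(joint_ue)  # forward/right/up position in centimetres
```

In a full pipeline, each skeletal landmark detected per frame would pass through such a conversion before being applied to the virtual character's joints.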