This paper proposes a joint gesture tracking method that combines a particle filter with the mean shift algorithm to improve tracking accuracy and robustness. When the hand moves slowly, mean shift is first applied to the particles so that most of them drift into the gesture region. When the hand moves quickly or is occluded, a particle whose mean shift fails to detect a gesture region is restored to its pre-drift state, and processing continues with the next frame. The gesture position is then computed directly from the particles used for subsequent detection, which reduces tracking time. In simulation experiments with 200 sampling points, the proposed joint tracking algorithm achieves a tracking accuracy of 95.2%, compared with 90.6% for the Cam-shift algorithm, and reduces the tracking time from 83.7 ms to 25.8 ms. The proposed algorithm therefore substantially improves tracking accuracy and real-time performance, and it effectively reduces the impact of complex environments on tracking.
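
The drift-and-revert step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a per-pixel gesture likelihood map `prob` (for example, a skin-color back projection) is already available, and the names `window_stats`, `mean_shift`, `track_frame`, the window half-size `half`, and the detection threshold `min_mass` are hypothetical. Particle weighting and resampling are omitted, and averaging the successfully drifted particles is one assumed way of computing the position directly from the particles.

```python
def window_stats(prob, cx, cy, half):
    """Return (mass, centroid_x, centroid_y) of prob in a square window."""
    height, width = len(prob), len(prob[0])
    mass = sum_x = sum_y = 0.0
    for y in range(max(0, cy - half), min(height, cy + half + 1)):
        for x in range(max(0, cx - half), min(width, cx + half + 1)):
            p = prob[y][x]
            mass += p
            sum_x += p * x
            sum_y += p * y
    if mass == 0.0:
        # Empty window: report the window center so the caller does not move.
        return 0.0, float(cx), float(cy)
    return mass, sum_x / mass, sum_y / mass


def mean_shift(prob, x, y, half=4, max_iter=20):
    """Move (x, y) to the local centroid until it stops changing."""
    for _ in range(max_iter):
        _, mx, my = window_stats(prob, x, y, half)
        nx, ny = int(round(mx)), int(round(my))
        if (nx, ny) == (x, y):
            break
        x, y = nx, ny
    mass, _, _ = window_stats(prob, x, y, half)
    return x, y, mass


def track_frame(prob, particles, half=4, min_mass=5.0):
    """Drift each particle by mean shift; revert it if no gesture is found.

    Returns the updated particle list and the estimated gesture position
    (mean of the particles that reached a gesture region), or None if no
    particle reached one. min_mass is an assumed detection threshold.
    """
    updated, drifted = [], []
    for (x, y) in particles:
        nx, ny, mass = mean_shift(prob, x, y, half)
        if mass >= min_mass:
            updated.append((nx, ny))   # keep the drifted state
            drifted.append((nx, ny))
        else:
            updated.append((x, y))     # restore the pre-drift state
    if not drifted:
        return updated, None
    est_x = sum(p[0] for p in drifted) / len(drifted)
    est_y = sum(p[1] for p in drifted) / len(drifted)
    return updated, (est_x, est_y)


if __name__ == "__main__":
    # Synthetic 40x40 map: one 8x8 "gesture" blob plus a single noise pixel.
    prob = [[0.0] * 40 for _ in range(40)]
    for yy in range(10, 18):
        for xx in range(20, 28):
            prob[yy][xx] = 1.0
    prob[33][5] = 1.0
    particles = [(17, 12), (30, 20), (2, 35)]
    print(track_frame(prob, particles))
```

In this toy setup, a particle that drifts onto the isolated noise pixel does not gather enough window mass and is returned to where it started, while the particles near the blob contribute to the position estimate. A real tracker would follow this step with the particle filter's usual weighting and resampling.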