Kinect skeleton tracking with Visual Python

Lately I have been experimenting with Microsoft’s Kinect and accessing it from a PC.  The Kinect sensor provides a 640×480 color image and a 320×240 depth image, each at 30 frames per second (*).  With all of that data, and with all of the potential applications (even above and beyond the obvious gaming on the Xbox 360), the prospect of working with the Kinect certainly generates a large “wow factor” for programming students.

However, it is not exactly easy to get started.  One of the most challenging (and most interesting) problems is “skeleton tracking”: given a human body in the field of view of the sensor, how do we turn all of that depth and color information into a representation of the body?

Fortunately, that hard problem is already solved for us onboard the sensor.  In addition to the color and depth images, the Kinect API also provides tracking information for “skeletons” in the field of view, consisting of 3D positions of 20 “joints” throughout the body, as shown in the following figure:

The 20 joints that make up a Kinect skeleton.

The next problem is how to take this skeleton tracking information and turn it into a useful visualization.  This is where Visual Python (or VPython) comes in.  As I have commented before, I think one of the greatest strengths of VPython is the ease with which students can create reasonably polished 3D scenes, without the complexity of window event loops or the need to separate rendering from business logic.  There is only business logic; rendering is done under the hood.  You simply create/manipulate the objects in the scene, and they appear/move automatically.
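
For example, the following is a complete VPython program, with no Kinect involved: it creates a sphere, which immediately appears in a window, and then repeatedly changes the sphere’s position, which moves the rendered sphere with no further effort:

from visual import *

ball = sphere(pos=(0, 0, 0), radius=0.1, color=color.red)
while True:
    rate(30)            # limit the loop to 30 iterations per second
    ball.pos.x += 0.01  # the rendered sphere drifts to the right on its own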

My goal was to take a Python wrapper for the Kinect API, and Visual Python, and combine them into a simple Python module that allows students to (1) quickly get up and running with a visualization of skeleton tracking, and (2) make it easier to write their own code to experiment with gesture recognition, interact with other 3D objects in a “virtual world,” etc.

The result is a module called vpykinect; following is example code and some screenshots of using vpykinect to display a tracked skeleton as it stands (or moves) in front of the Kinect sensor:

from visual import *
import vpykinect

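# Draw a model of the sensor at the origin, and create a skeleton that
# starts out hidden until tracking begins.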
vpykinect.draw_sensor(frame())
skeleton = vpykinect.Skeleton(frame(visible=False))
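# update() returns True while a skeleton is being tracked, so the skeleton
# is shown exactly when tracking succeeds.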
while True:
    rate(30)
    skeleton.frame.visible = skeleton.update()

Screenshots of a Kinect skeleton (that’s me!) facing the sensor. All images are rotated and/or zoomed views of the same still scene.

Finally, following are instructions for installing all of the necessary software:

  1. Install the Kinect SDK 1.0 here: http://www.microsoft.com/en-us/download/details.aspx?id=28782
  2. Install Visual Python here: http://www.vpython.org/
  3. Install PyKinect 1.0 here: http://pypi.python.org/pypi/pykinect/1.0  (I chose PyKinect simply because it required zero setup.  Either libfreenect or OpenNI would work equally well, at least for the Xbox version of the Kinect that they support.)
  4. Run or import my vpykinect module here: https://sites.google.com/site/erfarmer/downloads

The vpykinect module is currently of the “dumbest thing that could possibly work” variety.  For me, it was an experiment in just how little code I had to write to get something useful working.  Because of this, it has at least a couple of obvious limitations:

  1. It does not directly include the constants for indexing into the list of individual joints.  These are available in the PyKinect module as JointId values (e.g., the position of the right hand is skeleton.joints[vpykinect.JointId.HandRight.value].pos, which is more verbose than it needs to be; see the sketch after this list)… but this could also be an interesting cooperative exercise: for example, students could systematically change the color of individual joints (i.e., spheres) to work out the mapping themselves.
  2. It does not support tracking multiple skeletons simultaneously.  Again, the PyKinect module supports this, but I did not include it here for simplicity.
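
For the first limitation, here is a minimal sketch of using those constants, given the skeleton object from the example above (this assumes, as the expression above suggests, that vpykinect re-exposes PyKinect’s JointId enumeration from pykinect/nui/structs.py):

# Look up the right hand by name instead of the bare index 11.
right_hand = skeleton.joints[vpykinect.JointId.HandRight.value]
print(right_hand.pos)  # a 3D vector, in meters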

(*) Higher resolutions and more capabilities are available, especially with the newer Kinect for Windows sensor, which differs in some respects from the original Kinect for Xbox 360.

Edit: In response to questions from Kristians in the comments, following is another example program demonstrating simple gesture recognition.  This program recognizes “waves” of your right hand, where a wave consists of raising your right hand above your shoulder.

from visual import *
import vpykinect

skeleton = vpykinect.Skeleton(frame(visible=False))
skeleton.frame.visible = False
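# Debounce flag: True while the right hand is considered raised.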
raised = False

while True:
    rate(30)
    skeleton.frame.visible = skeleton.update()
    if skeleton.frame.visible:
        right_hand = skeleton.joints[11]
        right_shoulder = skeleton.joints[8]
        spine = skeleton.joints[1]
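        # A wave triggers once when the hand rises above the shoulder; the
        # flag resets only after the hand drops below the spine.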
        if right_hand.y > right_shoulder.y and not raised:
            raised = True
            print('Recognized right hand wave.')
        elif right_hand.y < spine.y and raised:
            raised = False

Note the hard-coded indexing into the list of joints mentioned in the limitations above.  Also, we must “debounce” the recognition of each wave, by requiring the hand to be lowered below a threshold that is lower than the one for raising it.  I chose the “spine” joint as a convenient lower threshold.
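
To turn this example into the repetition counter that prompted it, it is enough to increment a counter each time a wave is recognized.  A minimal sketch (count is a new variable introduced here; everything else is unchanged):

count = 0  # new: the number of waves recognized so far

# ...same setup and while-loop as above, with the recognition branch becoming:
if right_hand.y > right_shoulder.y and not raised:
    raised = True
    count += 1
    print('Reps so far: {}'.format(count))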

45 Responses to Kinect skeleton tracking with Visual Python

  1. Thomas says:

    Hey, nice article, but is there any way for you to maybe make a how-to guide on what to do after you download the programs, for people that don’t have that much programming experience?  For example, what compiler to use?  I’m using Eclipse right now to do Python programming; would it be possible for you to make a guide on what to do using that compiler?

    • Hmmm. I may misunderstand your question, but it is perhaps worth clarifying a few things. First, you don’t need a compiler for any of this. That is, having installed the Kinect SDK, Visual Python, and PyKinect, you can access the sensor via “pure Python” code, of which my vpykinect.py module is one example.

      On the other hand, if by “compiler” you actually just meant “Python IDE,” then perhaps I can help. Can you describe at which of the 4 steps in the above instructions you are stuck?

  2. Kristians says:

    Hey, I’m working with the Kinect on a project to make a personal training assistant, and I’m having trouble understanding the xyz values, so I could implement counting repetitions (every time you do the same move, for example a simple dumbbell lift, it counts it, and also displays the count). The problem is that I don’t know where to get those xyz values to set some kind of trigger that would count that rep.
    Basically, I don’t understand how to read that I just lifted my right hand, and how to write a counter for it.
    I could really use some help.

    • There seem to be two separate questions here, I’m not sure which/both of them you are asking about:

      1. Given an (x,y,z) position of a particular joint, such as your right hand, what is the coordinate system used? That is, what do the x-, y-, and z-values mean? If this is one of your questions, then check out the figure on the right in the original post, and imagine that you are standing in front of and facing the sensor. The origin (0,0,0) of the coordinate system is located at the Kinect sensor itself. The positive z-axis extends straight out in “front” of the sensor, so that an object behind you would have an even *larger* z-coordinate. The positive x-axis extends in the direction of your right hand, and the positive y-axis extends up. The units of (x,y,z) values are in meters.

      2. How do you “read” the (x,y,z) coordinates of a joint in the first place? If this is your question, then the answer depends on what “level” of interface you plan to use. Are you using the Kinect SDK directly, i.e. writing C++ code? Or are you using the PyKinect module (step 3 in the installation instructions above), which is essentially just a Python wrapper around the lower-level C++ calls to the Kinect SDK? Or are you using my vpykinect module (step 4 above), which is in turn just a Visual Python wrapper around calls to the PyKinect module?

      • Kristians says:

        I’m using your VPython module and trying to implement the counting thing in that script, and I need to get some values for xyz, so that when the right hand goes up, it adds +1. I don’t know how to make the hand gesture recognition work. Also, I tried writing xyz to a file, but that didn’t work out; only printing to the screen worked.

        I want the recognition to work for only the 4 joints that are in both hands. Sorry, English isn’t my native language, and I lack quite a lot of knowledge about the Kinect, so it is pretty hard to formulate the question right. How do I keep track of a joint’s coordinates, so I can see the values and get a Δx or something that would be the key value for counting?

      • I edited the post to include a working example of the gesture recognition that you describe. As you can see in the program, you can inspect the coordinates of a joint as right_hand.x, right_hand.y, and right_hand.z (or collectively as a vector via right_hand.pos).

  3. Kristians says:

    Thank you very much for the example, it makes some things much clearer. I’ll look into this today and try to maybe implement some new gestures. But how do you know which joint ID equals which body joint?

    • Good question; this is one of the two limitations mentioned in the original post. I left this out intentionally since I use this module in a programming class, to raise exactly this question. One way to figure out the joint IDs (in a classroom setting, for example) is described in the original post: expand this program to incrementally change the color of each joint in turn, and note the resulting changes in the VPython display window. For example, something like:

      for joint in skeleton.joints:
          joint.color = color.red
          raw_input('Press Enter to highlight next joint...')

      Of course, a faster way would be to inspect the PyKinect module source that determines the mapping. If you look in pykinect/nui/structs.py, the class JointId specifies the constants for each of the 20 joints.

  4. Is it possible to use PyKinect without VPython, using a normal version of Python like Python 2.6 under Windows?
    I’m a Python programmer, but I do not use Visual Python. I program in Python 2.6 and Python 2.7 under Windows. Can I use this module? Thanks

    • If you mean the PyKinect 1.0 module in step 3 of the setup instructions in the post, then yes, you can certainly use that module directly without Visual Python. If it’s helpful, you can look at the code in *my* vpykinect.py module (step 4 of the setup instructions) for examples of how to use it, skipping the Visual Python stuff.

  6. ajit says:

    Can anyone please tell me how to start building Kinect apps from scratch? I am planning to build a Kinect app as my final project for my masters. I have experience in Python programming, especially games, but I am not sure where to start. (I have visited openkinect.com but got confused.) Please help.

  7. lili says:

    I need to save the skeletal joints for offline data analysis, especially for gesture recognition. Would you mind explaining how to save the related data, organized in a text file, as an example?
    Thanks

    • I don’t think you would need to explicitly save to a text file; you can simply print the desired values to stdout, and pipe to a text file. The code inside the while-loop would look something like this:

      for joint in skeleton.joints:
          print('{} {} {}'.format(joint.x, joint.y, joint.z))
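
      (For example, with that loop inside a full script saved as record_joints.py, a name chosen here just for illustration, running python record_joints.py > joints.txt from a command prompt would capture one line per joint per frame.)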

  8. Philipp says:

    Great tutorial, thank you! I would like to add that the link to the Kinect SDK is “outdated”. The link now points to the Kinect 2.0 SDK. This tutorial works with the Kinect 1.0 SDK which can be found here: http://www.microsoft.com/en-us/download/details.aspx?id=28782

    • Thanks for this; I have updated the post to reflect the new link. I see that PyKinect has also been updated to 2.1, so it’s probably time to see how much still works with these new versions.

  9. max says:

    This is great.
    Is it possible to run this on a Raspberry Pi?

    • That’s a good question. I think the answer is probably no– a quick web search yields that (1) Visual Python requires more graphics horsepower than the Raspberry Pi can provide, and (2) a couple of open source libraries provide access to the camera and depth data from the Kinect, but not the skeleton joints, which is what you would need to couple these together.

  10. dayna.larson22@gmail.com says:

    How do you grab x,y,z position data from the skeleton’s right hand?

    • Given skeleton = vpykinect.Skeleton(…), the member skeleton.joints is a list of Visual Python sphere() objects indicating the position of each corresponding joint. For example, the left and right hands are skeleton.joints[7] and skeleton.joints[11], respectively. VPython 3D graphics objects like sphere() have a .pos member that is a 3D vector indicating their position.

      So, for example, skeleton.joints[11].pos.x is the x-coordinate of the skeleton’s right hand.

  11. Dayna says:

    Another question: I am looking to modify the skeleton to only show the skeleton’s right arm. I have eliminated all of the joint IDs except [2,8] [8,9] [9,10] [10,11] but the joints are still showing up. How else do I modify the script to allow me to do this? Thanks!

    • The skeleton is made up of a collection of 20 “joints,” each represented as a sphere() object, and some number of “bones,” each represented as a cylinder() object drawn between a pair of joints. From your comment, it sounds like you have only edited the _bone_ids list, which will remove the appropriate cylinders… but all 20 of the joint spheres are still there, right? (See lines 21-22.)

      Because of the way the underlying pykinect.nui interface works, the easiest way to “remove” unwanted joint spheres is to not actually remove them, but just make them invisible. For example: for i in [0, 1, 3, 4, 5, 6, 12, 13, 14, 15, 16, 17, 18, 19]: self.joints[i].visible = False
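
      Put together for the right-arm case, a minimal sketch (note that outside the Skeleton class the list is skeleton.joints rather than self.joints):

      # Keep the shoulder center (2) and the right shoulder, elbow, wrist,
      # and hand (8, 9, 10, 11); hide every other joint sphere.
      for i in range(20):
          if i not in (2, 8, 9, 10, 11):
              skeleton.joints[i].visible = False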

  12. Kamel says:

    Hi, I got this error when running vpykinect.py:

      Traceback (most recent call last):
        File "C:\Users\Kamel\Desktop\Python Skeleton Kinect\vpykinect.py", line 79
          _kinect = nui.Runtime()
        File "C:\Users\Kamel\Desktop\Python Skeleton Kinect\pykinect\nui\__init__.py", line 114, in __init__
          raise KinectError('Unable to create Kinect runtime ' + traceback.format_exc())
      KinectError: Unable to create Kinect runtime Traceback (most recent call last):
        File "C:\Users\Kamel\Desktop\Python Skeleton Kinect\pykinect\nui\__init__.py", line 109, in __init__
          self._nui.NuiInitialize(nui_init_flags)
        File "C:\Users\Kamel\Desktop\Python Skeleton Kinect\pykinect\nui\_interop.py", line 155, in NuiInitialize
          _NuiInstance._NuiInitialize(self, dwFlags)
        File "_ctypes/callproc.c", line 945, in GetResult
      WindowsError: [Error -2097086443] Windows Error 0x83010015

    Sorry, I’m a little bit of a noob :/

    • This error is in the initialization of pykinect; which version of this library are you using? Note that vpykinect was developed and tested with pykinect 1.0, but I see that the current latest version is 2.1, which I haven’t investigated yet and may not be compatible.

      • Kamel says:

        I’ve checked it again and I’m using PyKinect version 1.0…
        I’m using Kinect SDK 1.8; could that be the problem?
        Thanks in advance.

  13. @Kamel, again, this has only been tested using SDK 1.0.  Try it with 1.0, and if it works, then it’s definitely an incompatibility of the NUI runtime.  (Note that we aren’t really even “inside” vpykinect yet; the problem you are observing is getting PyKinect and the SDK to talk to each other.)

  14. Duarte says:

    Hello, I am trying to capture the movement values and turn them into angles to move a robotic arm. Any tips on how to capture the rotation values of the joints to convert to angles? Thank you

    • Interesting question. There are at least a couple of issues here: the first is noise. Note that the software described in this post only provides the “raw” 30 Hz joint position data as provided by the Kinect sensor. As you can see if you have experimented with this, the joints can “jump around” quite a bit. You will want to filter these positions, probably very heavily, to ensure smooth motion of your robot arm.
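
      As a rough illustration of such filtering, an exponential moving average is a simple starting point.  This is only a sketch: alpha is a tuning parameter introduced here, joint 9 is the right elbow, and a real application would filter every joint it uses:

      alpha = 0.2  # smaller values smooth more, but respond more slowly
      smoothed = vector(skeleton.joints[9].pos)
      while True:
          rate(30)
          if skeleton.update():
              # blend each new raw position into the running estimate
              smoothed = alpha * skeleton.joints[9].pos + (1 - alpha) * smoothed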

      The second issue is degrees of freedom. That is, how many gimbal axes are you controlling? Using a human arm as an example, if you hold your arm still, and twist your forearm/wrist back and forth, but without changing the *position in space* of your hand, elbow, or shoulder, does your robot arm permit that twisting motion? If so, the Kinect joints don’t contain that information, they only indicate the *position in space* of each joint.

      Having said that, if you merely want to be able to control the “bending” motion at the elbow, and not the “twisting” motion of the forearm, then the answer is relatively simple: the angle formed at the elbow between the upper arm and the forearm can be found using the dot product of the corresponding vectors extending from the elbow joint.

      • Duarte says:

        Hi, thanks for the reply. I actually had that issue with the joints “jumping” sometimes. I’ll filter that by considering the last few values and cutting out improbable variation (that’s the plan; I hope to figure out a way).
        How can I get the vector values like you said: “the angle formed at the elbow between the upper arm and the forearm can be found using the dot product of the corresponding vectors extending from the elbow joint.”
        Can you give a sample, maybe?
        Thanks a lot

  15. @Duarte, using the code at the bottom of this post as a guide, here is a pastebin showing how to (1) retrieve the positions of the (right) shoulder, elbow, and wrist; (2) convert to unit vectors from the elbow; and (3) compute the angle (in radians) formed at the elbow.
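
    The gist of that pastebin, reconstructed here from the snippets quoted in the next comment (a sketch; joints 8, 9, and 10 are the right shoulder, elbow, and wrist):

      from visual import *
      from math import acos
      import vpykinect

      skeleton = vpykinect.Skeleton(frame(visible=False))
      while True:
          rate(30)
          skeleton.frame.visible = skeleton.update()
          if skeleton.frame.visible:
              shoulder = skeleton.joints[8]
              elbow = skeleton.joints[9]
              wrist = skeleton.joints[10]
              # unit vectors pointing from the elbow toward the shoulder and wrist
              elbow_to_shoulder = norm(shoulder.pos - elbow.pos)
              elbow_to_wrist = norm(wrist.pos - elbow.pos)
              # angle formed at the elbow, in radians
              angle_radians = acos(dot(elbow_to_shoulder, elbow_to_wrist))
              print(angle_radians)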

  16. Duarte says:

    Hello, I tested your code and it worked perfectly. I have some more questions.

    elbow_to_shoulder = norm(shoulder.pos - elbow.pos)
    I want just the elbow angle variation here, to convert to degrees and send to a servo motor.

    elbow_to_wrist = norm(wrist.pos - elbow.pos)
    And also the same here: just the shoulder degrees, to send to another servo motor.

    angle_radians = acos(dot(elbow_to_shoulder, elbow_to_wrist))
    Here you combine both pieces of information?

    Sorry for asking so much

    • Michael Mendelsohn says:

      If you really want to understand this, you should consult an introduction to analytic geometry; any explanation that fits within the comment section of a blog post is going to be unsatisfactory.

      In short, elbow_to_shoulder represents the direction of the upper arm, elbow_to_wrist represents the direction of the lower arm, and the third line uses those directions to compute the angle between the lower and the upper arm, i.e., the elbow angle.

  17. Miranda Roberts says:

    Hi,

    How do I get your ‘wave’ code to run? I am trying to run it, but nothing is happening.
    Should I paste the code into your vpykinect code, or run it as a separate file?

    • When you say “nothing is happening,” can you describe the symptoms in more detail? That is, (1) does the previous example (before the waving recognition stuff) work? (2) When you run the code, do you see error messages? (3) Does the VPython graphics window show a responsive skeleton but just no recognition of when you “wave,” etc.?

      The vpykinect.py file is intended as a module to be imported; the example code in the post should be run as a separate file.

      • Miranda Roberts says:

        Thank you for getting back to me.

        1) So I can run the previous code and see a responsive skeleton; that’s fine.
        2) I don’t get an error message.
        3) Yes, precisely that. There’s no response at all.

        I’m basically trying to create data sets for some gestures, such as walking, sitting, jumping, etc., using the x, y, z coordinates of each joint, and putting them in a .txt file for each gesture. So the Kinect records all the joints of the skeleton and puts the coordinates in a file.
        I thought your ‘wave’ code would give me a guide, but if you could help in any way, that would be greatly appreciated!

        Thank you so much!

  18. @Miranda, well, hmmm. Given your description of the problem, I’m not sure where the cause might be. My recommendation would be to insert some print() statements in the while loop to inspect the coordinates of the hand, the shoulder, etc., and see if they behave as expected as you move the skeleton around. Then see if the appropriate conditions are satisfied for a “wave” (i.e., does the hand move above the shoulder?), and if not, what *are* the coordinates of the hand and shoulder when you *do* wave?
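
    For example, something like this at the bottom of the if-block, reusing the names from the wave example in the post:

      print('{} {} {}'.format(right_hand.y, right_shoulder.y, raised))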

    • Columbus says:

      It could be because the second code block is missing the vpykinect.draw_sensor(frame()) line.

      • Miranda Roberts says:

        And where would this line be placed in the code?

      • Columbus says:

        Something like this:

        from visual import *
        import vpykinect

        vpykinect.draw_sensor(frame())
        skeleton = vpykinect.Skeleton(frame(visible=False))
        skeleton.frame.visible = False
        raised = False

        while True:
            rate(30)
            skeleton.frame.visible = skeleton.update()
            if skeleton.frame.visible:
                right_hand = skeleton.joints[11]
                right_shoulder = skeleton.joints[8]
                spine = skeleton.joints[1]
                if right_hand.y > right_shoulder.y and not raised:
                    raised = True
                    print('Recognized right hand wave.')
                elif right_hand.y < spine.y and raised:
                    raised = False

  19. Columbus says:

    I’m working on a way to measure the angle of the knee but I’m having a lot of variation. Does anyone have any idea how to improve accuracy?

    from visual import *
    import vpykinect
    import numpy as np

    def get_angles(KneeLeftPos, HipLeftPos, AnkleLeftPos):
        trans_a = HipLeftPos - KneeLeftPos
        trans_b = AnkleLeftPos - KneeLeftPos
        angles = np.arccos(np.sum(trans_a * trans_b, axis=0) /
                           (np.sqrt(np.sum(trans_a ** 2, axis=0)) *
                            np.sqrt(np.sum(trans_b ** 2, axis=0))))
        return (np.pi - angles) * (180 / np.pi)

    vpykinect.draw_sensor(frame())
    skeleton = vpykinect.Skeleton(frame(visible=False))
    while True:
        rate(30)
        skeleton.frame.visible = skeleton.update()
        if skeleton.frame.visible:
            # LEFT SIDE
            HipLeft = skeleton.joints[12]
            KneeLeft = skeleton.joints[13]
            AnkleLeft = skeleton.joints[14]

            HipLeft.color = color.green
            KneeLeft.color = color.red
            AnkleLeft.color = color.green

            HipLeftPos = np.array([HipLeft.x, HipLeft.y, HipLeft.z])
            KneeLeftPos = np.array([KneeLeft.x, KneeLeft.y, KneeLeft.z])
            AnkleLeftPos = np.array([AnkleLeft.x, AnkleLeft.y, AnkleLeft.z])

            l_a = get_angles(KneeLeftPos, HipLeftPos, AnkleLeftPos)

            # RIGHT SIDE
            HipRight = skeleton.joints[16]
            KneeRight = skeleton.joints[17]
            AnkleRight = skeleton.joints[18]

            HipRight.color = color.green
            KneeRight.color = color.red
            AnkleRight.color = color.green

            HipRightPos = np.array([HipRight.x, HipRight.y, HipRight.z])
            KneeRightPos = np.array([KneeRight.x, KneeRight.y, KneeRight.z])
            AnkleRightPos = np.array([AnkleRight.x, AnkleRight.y, AnkleRight.z])

            r_a = get_angles(KneeRightPos, HipRightPos, AnkleRightPos)

            print('\nLeft Angle: %.1f' % float(l_a))
            print('\nRight Angle: %.1f' % float(r_a))

  20. Yogesh Pariyar says:

    I am trying to figure out how to display only the top half of the skeleton. I am able to get rid of the bones but not the joint spheres. Any help would be much appreciated. Thank you

    • Columbus says:

      To turn off the lower-body joint spheres:

      for i in [12, 13, 14, 15, 16, 17, 18, 19]: skeleton.joints[i].visible = False

      where the indices into skeleton.joints are:

      0  HipCenter
      1  Spine
      2  ShoulderCenter
      3  Head
      4  ShoulderLeft
      5  ElbowLeft
      6  WristLeft
      7  HandLeft
      8  ShoulderRight
      9  ElbowRight
      10 WristRight
      11 HandRight
      12 HipLeft
      13 KneeLeft
      14 AnkleLeft
      15 FootLeft
      16 HipRight
      17 KneeRight
      18 AnkleRight
      19 FootRight

      And to turn off bones, you can edit the _bone_ids list in vpykinect.py.
      E.g., for the upper body only:

      # A bone is a cylinder connecting two joints, each specified by an id.
      _bone_ids = [[0, 1], [1, 2], [2, 3], [7, 6], [6, 5], [5, 4], [4, 2], [2, 8], [8, 9], [9, 10], [10, 11]]
