Choosing a depth camera


#1

One of the projects I’m intending to pursue is the development of an advanced NUI (Natural User Interface) that integrates markerless mocap animation data for locomotion, body positioning and possibly gesture commands.
However, looking at the available tech, I’m faced with a lot of decisions to make before I can confidently choose a solution! Since I’ve set a load of time aside for HiFi, but am still unable to get inworld (see here), I thought I’d write up my research and share what I’ve found here.

There are two starting points:

  1. Look at what’s already supported in HiFi - What’s in here
  2. Look at the available hardware and SDKs - What’s out there

What's in here

Since both the Kinect and the Xtion Pro can already be used by Faceshift, I feel safe presuming that there is already support for data streams from these cameras. If anyone could give me some detail on how this is done (e.g. which SDKs / libraries are being used, which classes implement the functionality), that would be great. I also see the PrioVR is integrated into Interface. It’s good to see that dealing with the data stream doesn’t seem too complex (mapping rotations from input joints to avatar joints), but the PrioVR suit doesn’t really fit my requirements. There’s also Sixense support, but as far as I can see it currently supports the Hydras only - and anyway, the Sixense STEM body capture is not markerless, so it doesn’t fit my idea of a Natural UI either.
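
To make that idea of mapping input joint rotations onto avatar joints a bit more concrete, here’s a rough sketch. Everything in it is hypothetical (the struct names, the 0.5 confidence cut-off, the avatar interface) - it’s just the shape of the per-frame loop I have in mind, not Interface’s actual classes:

```cpp
// Hypothetical sketch only - these are not Interface's real classes.
#include <glm/gtc/quaternion.hpp>
#include <map>
#include <vector>

struct InputJoint {
    int id;                // joint index as reported by the capture device
    glm::quat rotation;    // rotation relative to the joint's parent
    float confidence;      // tracking confidence, 0..1
};

// Stand-in for whatever avatar skeleton API Interface actually exposes.
struct AvatarSkeleton {
    std::map<int, glm::quat> jointRotations;
    void setJointRotation(int avatarJointIndex, const glm::quat& rotation) {
        jointRotations[avatarJointIndex] = rotation;
    }
};

// Copy each well-tracked input joint's rotation onto the mapped avatar joint.
void applyMocapFrame(const std::vector<InputJoint>& frame,
                     const std::map<int, int>& deviceToAvatarJoint,
                     AvatarSkeleton& avatar) {
    for (const InputJoint& joint : frame) {
        auto it = deviceToAvatarJoint.find(joint.id);
        if (it == deviceToAvatarJoint.end() || joint.confidence < 0.5f) {
            continue;  // skip unmapped or poorly tracked joints
        }
        avatar.setJointRotation(it->second, joint.rotation);
    }
}
```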

What's out there

There’s a huge range of solution combinations out there! I’m having trouble finding an ideal solution though, as matching both an SDK / library and a camera that fit the bill gets a little complex, particularly in terms of supported OSs and licensing issues. Here’s where I am:

Cameras

Purely on specifications, the obvious choice is the Kinect 2, complete with a 25-joint skeleton, larger operating distances and the highest camera resolutions. However, its SDK only runs on Windows 8. Similarly, the original Kinect’s SDK only runs on Win7 and Win8.

Looking at alternatives, the Asus Xtion Pro Live and the PrimeSense Carmine cameras seem to bubble up to the top of the list. The Asus Xtion Pro Live looks particularly good, as its SDK incorporates the OpenNI and NiTE middleware libraries (more on those below).

PrimeSense’s offering isn’t really in the running for me right now, as Apple bought them out last November and quickly took down all the Open Source OpenNI pages that PrimeSense originally supplied. It’s not clear what Apple will do next, but their decision to take down the web pages supporting a thriving Open Source community doesn’t exactly bode well. (Note: Structure have taken up the OpenNI baton, and now host the OpenNI page here with the wry title “The rumors of my death have been greatly exaggerated…”.)
Which leads me nicely onto the subject of available SDKs / libraries…

SDKs / libraries

This is where the harder decisions come in, as there doesn’t seem to be an option that fits all requirements:

Primary requirements

~ OS support: Linux, Windows, Mac
~ Easily accessible skeleton stream
~ Skeleton should have the highest possible number of bones / joints
~ No licensing issues

The Microsoft SDKs look like the easiest to get up and running, but of course the lack of OS support puts them straight out of the running for a long term solution.

There are two Open Source solutions worth looking at: OpenNI+SensorKinect, and the libfreenect driver from the OpenKinect community (the latter being a standalone driver rather than an OpenNI module).

The Asus Xtion Pro Live’s SDK looks great bar one thing - no Mac support - which I find weird, as I think there are already Mac users using the device with Faceshift. Can anyone shed light on this?
The Xtion SDK does support Windows 32/64-bit (XP, Vista, 7, 8), Linux (Ubuntu 10.10, x86 32/64-bit) and Android (by request). Since there is Linux support, I am holding out hope that Mac support may not be too far away, but that’s quite possibly my over-optimism!
The Asus Xtion Pro Live’s SDK has the OpenNI library bundled - ok, I guess it’s time I went into some detail on OpenNI and NiTE.

OpenNI
OpenNI is an industry-led, non-profit organization formed to certify and promote the compatibility and interoperability of Natural Interaction (NI) devices, applications and middleware. (source)
OpenNI supplies, amongst other streams, a skeletal data stream.
OpenNI is now on its second version. Whilst version 1 of the OpenNI SDK had good OS support, OpenNI 2’s Kinect support, apparently due to Kinect license restrictions, is Windows-only, so any HiFi Kinect development using OpenNI might have to run on OpenNI 1.5 or be limited to Windows (see the discussion on future Mac / Linux OpenNI 2 support using OpenNI2-Freenect here).

Sadly, it looks like OpenNI’s plan for the Kinect 2 is that it will only be supported by OpenNI 2, which in turn doesn’t support *nix. Bugger.
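
For a flavour of what coding against OpenNI 2 looks like, here’s a minimal depth-read sketch based on my reading of the OpenNI 2 API, with error handling mostly trimmed (note that in OpenNI 2 the skeleton itself comes from the NiTE middleware described below, not from OpenNI directly):

```cpp
// Minimal OpenNI 2 depth-read sketch (error handling mostly omitted).
#include <OpenNI.h>
#include <cstdio>

int main() {
    if (openni::OpenNI::initialize() != openni::STATUS_OK) {
        printf("OpenNI init failed: %s\n", openni::OpenNI::getExtendedError());
        return 1;
    }

    openni::Device device;
    device.open(openni::ANY_DEVICE);           // Xtion, Carmine, Kinect (Windows)...

    openni::VideoStream depth;
    depth.create(device, openni::SENSOR_DEPTH);
    depth.start();

    openni::VideoFrameRef frame;
    depth.readFrame(&frame);                   // blocks until a frame arrives
    const openni::DepthPixel* pixels =
        static_cast<const openni::DepthPixel*>(frame.getData());
    // With the default pixel format, depth values are in millimetres.
    printf("%dx%d depth frame, centre pixel = %d mm\n",
           frame.getWidth(), frame.getHeight(),
           pixels[frame.getHeight() / 2 * frame.getWidth() + frame.getWidth() / 2]);

    depth.stop();
    depth.destroy();
    device.close();
    openni::OpenNI::shutdown();
    return 0;
}
```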

NiTE
PrimeSense developed the NiTE middleware: the software that analyzes the data from the hardware, supplied as modules for OpenNI that provide skeleton, hand and gesture tracking. It is free but not open source, being released only as binaries (a rough usage sketch follows the list below).
Quoting the Linkedin page: “NiTE identifies users and tracks their movements, and provides the framework API for implementing Natural-Interaction UI controls based on gestures.” The system can then interpret specific gestures, making completely hands-free control of electronic devices a reality, including:
~ Identification of people, their body properties, movements and gestures
~ Classification of objects such as furniture
~ Location of walls and floor
(source)
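
Roughly, reading the NiTE 2 skeleton looks like the sketch below. This is my cut-down reading of the pattern used in the NiTE sample code, so treat the details as indicative rather than gospel:

```cpp
// Rough NiTE 2 skeleton-read sketch (error handling trimmed).
#include <NiTE.h>
#include <cstdio>

int main() {
    nite::NiTE::initialize();

    nite::UserTracker userTracker;
    if (userTracker.create() != nite::STATUS_OK) {
        printf("Couldn't create user tracker\n");
        return 1;
    }

    for (int i = 0; i < 300; ++i) {                    // ~10 seconds of frames
        nite::UserTrackerFrameRef frame;
        userTracker.readFrame(&frame);

        const nite::Array<nite::UserData>& users = frame.getUsers();
        for (int u = 0; u < users.getSize(); ++u) {
            const nite::UserData& user = users[u];
            if (user.isNew()) {
                // Newly detected person: ask NiTE to start skeleton tracking.
                userTracker.startSkeletonTracking(user.getId());
            } else if (user.getSkeleton().getState() == nite::SKELETON_TRACKED) {
                const nite::SkeletonJoint& head =
                    user.getSkeleton().getJoint(nite::JOINT_HEAD);
                if (head.getPositionConfidence() > 0.5f) {
                    // Positions are reported in millimetres.
                    printf("user %d head at (%.0f, %.0f, %.0f) mm\n", user.getId(),
                           head.getPosition().x, head.getPosition().y,
                           head.getPosition().z);
                }
            }
        }
    }

    nite::NiTE::shutdown();
    return 0;
}
```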

If you’re interested in finding out more about OpenNI and NiTE, there’s a very detailed YouTube explanation with examples here.

An interesting, possibly leftfield SDK option (or maybe just food for thought) is ZigFu, who supply a JavaScript front end for reading Kinect data as a browser plugin (plugins for Unity and Flash are also available). Whilst it looks like a very, very easy way to connect to a depth camera’s data streams, it currently supports Kinect cameras only, and only on Windows and Mac. Still, it’s an interesting idea, as the development times for JS-based projects are a fraction of those for C++…

Summary

For now, I’m tempted to use a Kinect for Windows SDK 2.0 / Kinect 2 solution for development purposes in the hope that OpenNI 2 will eventually support other OSs (possible, but licensing issues are a problem) and the Kinect 2 (seems very likely). Of course, I’d be forced to upgrade to Windows 8, which irritates me no end…
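
For comparison, pulling the 25-joint skeleton out of the Kinect for Windows SDK 2.0 looks roughly like the sketch below. It’s my cut-down reading of the SDK’s body-frame pattern (COM releases and most error handling omitted, so indicative only):

```cpp
// Rough Kinect for Windows SDK 2.0 body-frame sketch.
// Link against Kinect20.lib; COM releases / error handling mostly omitted.
#include <Kinect.h>
#include <cstdio>

int main() {
    IKinectSensor* sensor = nullptr;
    if (FAILED(GetDefaultKinectSensor(&sensor)) || !sensor) {
        printf("No Kinect sensor found\n");
        return 1;
    }
    sensor->Open();

    IBodyFrameSource* bodySource = nullptr;
    sensor->get_BodyFrameSource(&bodySource);
    IBodyFrameReader* bodyReader = nullptr;
    bodySource->OpenReader(&bodyReader);

    IBodyFrame* bodyFrame = nullptr;
    while (FAILED(bodyReader->AcquireLatestFrame(&bodyFrame))) {
        // Busy-wait for brevity; real code would use the frame-arrived event.
    }

    IBody* bodies[BODY_COUNT] = { nullptr };           // up to 6 tracked bodies
    bodyFrame->GetAndRefreshBodyData(BODY_COUNT, bodies);

    for (IBody* body : bodies) {
        BOOLEAN tracked = FALSE;
        if (body && SUCCEEDED(body->get_IsTracked(&tracked)) && tracked) {
            Joint joints[JointType_Count];             // 25 joints per body
            JointOrientation orientations[JointType_Count];
            body->GetJoints(JointType_Count, joints);
            body->GetJointOrientations(JointType_Count, orientations);
            // Camera-space positions are in metres.
            printf("head at (%.2f, %.2f, %.2f) m\n",
                   joints[JointType_Head].Position.X,
                   joints[JointType_Head].Position.Y,
                   joints[JointType_Head].Position.Z);
        }
    }

    bodyFrame->Release();
    sensor->Close();
    return 0;
}
```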

I’d be really interested to hear from anyone else who’s got any tidbits of info on the subject of full body tracking. Please note, I am still very new to the project, its architecture, codebase and capabilities, so as usual there’s a good chance I’ve spouted some complete bollocks here - really, please do feel free to correct / enlighten me as appropriate!

  • Dave

#2

Here’s a tidbit … Beware the NiTE license: it is (or at least was when I investigated some time ago) a “Disney” license, with some requirement for you to guarantee your software only provides PG content, or something like that, i.e., not compatible with open-ended virtual worlds.


#3

@ctrlaltdavid - that is very much worth bearing in mind, cheers. I guess it would be worth checking the Kinect SDK 1.8 commercial license too; I wouldn’t be surprised if they’ve included something similar.

What a shame when the biggest obstacles seem to be licensing issues…


#4

Ah yes, that reminds me … The Kinect library, at least at some stage, had some clause about your application being Kinect-only, i.e., not allowed to use any other “sensor”.


#5

It all seems like a fragmented nightmare.


#6

Aw crap, it gets worse - I got this reply from the peeps at FaceShift:

we do not support kinect 2, as actually the kinect 2 has worse data quality than the kinect 1 in close range scenarios. It is very unfortunate, in particular as we have a lot of requests for it.

Am slightly confused though - I could be wrong, but I thought people were already using the Kinect 2 with Faceshift in HiFi - has anyone had any experience with this?

I’m never going to be able to choose a camera at this rate!

  • Dave