[Noisebridge-discuss] Build advice for a new system / heavy cluster GPU AI processing?

Tue Jul 12 04:22:34 UTC 2011

Comments inline:

On Mon, Jul 11, 2011 at 8:27 PM, Sai <sai at saizai.com> wrote:
> On Mon, Jul 11, 2011 at 22:15, Mike Schachter <mike at mindmech.com> wrote:
>> The grid search is your problem! It's unavoidable when you're
>> doing cross validation though, because you definitely want the
>> parameters that give you the lowest generalization error. You're
>> doing cross validation, right?
>
> Of course. That's kinda the main point - I want to know
> a) what the best performance is on various parameters of binning,
> vectorization method etc
> b) whether there's some trend that may be interesting in the C/G
> params over that, such as narrowness of optimum params, relationship
> to bin size, or the like
>
> Cross-validation results are the primary datum. ;-)

Sounds reasonable - are you dealing with spike data, or something
like EEG?

>> Although a GPU will help individual instances of training the
>> SVM classifier, in general you should parallelize the grid search
>> across cores.
>
> Sorry, I should've been clearer - I can easily use all 4 of my cores
> using matlabpool (and for that matter multiple remote cores if it's
> set up correctly), I just reported the single-core timings for
> simplicity.
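
For anyone following along outside MATLAB, the same per-combo
parallelism can be sketched in Python. This is only a rough sketch under
assumptions: cv_error is a made-up placeholder for one full
cross-validated training run (here a toy bowl-shaped surface so the code
runs), and the log2(C) / log2(gamma) exponent ranges are illustrative,
not a recommendation.

```python
import concurrent.futures as cf
from itertools import product


def cv_error(combo):
    """Hypothetical stand-in for one cross-validated training run.

    A real version would train the classifier on each fold and return the
    mean held-out error; this toy surface has its minimum at (3, -5).
    """
    log2_c, log2_gamma = combo
    return (log2_c - 3) ** 2 + (log2_gamma + 5) ** 2


def grid_search(log2_cs, log2_gammas, executor_factory=cf.ProcessPoolExecutor):
    """Score every (C, gamma) combo in parallel; return (best_combo, best_error)."""
    combos = list(product(log2_cs, log2_gammas))
    with executor_factory() as pool:
        # Each combo is retrained from scratch, so the jobs are independent
        # and can be handed to whichever worker is free.
        errors = list(pool.map(cv_error, combos))
    best_error, best_combo = min(zip(errors, combos))
    return best_combo, best_error
```

Calling grid_search(range(-5, 16, 2), range(-15, 4, 2)) walks
11 x 10 = 110 combos. On platforms that start workers by spawning a
fresh interpreter, make that call under an
if __name__ == "__main__": guard. Passing
executor_factory=cf.ThreadPoolExecutor is handy for debugging, though
threads won't speed up CPU-bound pure-Python training.
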
>> Specifically, train an SVM classifier per hyperparameter
>> combination (kernel, bin size, etc).
>
> As in one training per hyperparam combo? If that were possible — i.e.
> if I didn't have to retrain the damn thing from scratch for every step
> in the grid search — that would drastically cut down my optimization
> time.

You have to retrain from scratch per hyperparam combo, but...
libsvm uses "one-vs-one" multi-class classification. That means,
per hyperparam combo, for 8 classes it's training
(8 choose 2) = 8*7/2 = 28 independent binary SVMs to do it. You
might want to look into SVM-multiclass, from the SVM-light family,
for multi-class classification:

http://svmlight.joachims.org/svm_multiclass.html
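
To make the one-vs-one cost concrete: it trains one binary classifier
per unordered pair of classes, i.e. k*(k-1)/2 of them for k classes. A
small sketch of that arithmetic follows; the 110 combos and 5 folds are
made-up illustrative numbers, and both function names are hypothetical.

```python
from itertools import combinations
from math import comb


def one_vs_one_pairs(n_classes):
    """List every unordered pair of class labels; one binary SVM per pair."""
    return list(combinations(range(n_classes), 2))


def total_binary_fits(n_classes, n_combos, n_folds):
    """Binary SVM trainings for a whole grid search with k-fold cross validation."""
    return comb(n_classes, 2) * n_combos * n_folds


print(len(one_vs_one_pairs(8)))      # 28
print(total_binary_fits(8, 110, 5))  # 15400
```

So the per-combo cost multiplies quickly once folds and grid size are
factored in, which is why the grid search dominates the runtime.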

>> Also, SVM kind of sucks for multi-class classification. Have you
>> considered random forests?
>
> I'm not familiar with that. Could you give me a pointer?

I'm most familiar with the random forests in R:

http://cran.r-project.org/web/packages/randomForest/index.html

The canonical paper on them is here:

http://oz.berkeley.edu/users/breiman/randomforest2001.pdf

> Ideally I would like to be able to compare multiple different
> classifier methods, as that's a large part of what interests me in the
> question - eg maybe there's some interesting case where some
> classifiers are better in one kind of binning and another set are
> better for another kind.
>
> Which of course means I still need to run even the slow ones. :-/

Definitely try random forests, they're the hot shit and people will
eat it up. Comparing things to linear classifiers (or SVMs with a
linear kernel) is kind of classic. If you want to never finish your
PhD, keep going and try out neural networks and deep nets (neural
networks with many pre-trained hidden layers). A long-ass detailed
paper on those can be found here:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.156.4732&rep=rep1&type=pdf

And more importantly, there is a Python framework called Theano
which takes care of parallelizing things on the GPU for you:

http://deeplearning.net/software/theano/

Also consider perusing the software page on the Noisebridge ML
wiki, if you haven't already:

https://www.noisebridge.net/index.php?title=Machine_Learning

Hope that helps!

  mike