/ai/ AI/Deep Learning General

Also, AWS and Paperspace GPU VPS instance credit/promo code begging thread.

What are some alternatives to TensorFlow?

Everyone who isn't Google currently uses PyTorch.

What advantages does it offer?

/thread

How would I get started working with AI and Deep Learning?

The framework is far less braindead in its design and less buggy.

You still have to use TensorFlow if you want to use Google TPU or want to have the trained model be exportable to Android devices since they're all locked to TensorFlow.

Has anyone made a library that isn't written in fucking PYTHON, the slowest language on the market? Like jesus fuck guys, you aren't writing webdev meme shit, you're writing cutting-edge high-performance software. Why can't you use an actual language?

It's a thin Python wrapper around C++ code. There's barely any performance difference between using the Python wrapper vs using full C++.
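If you want to convince yourself of that, here's a rough micro-benchmark sketch (sizes are made up; the point is that kernel time dwarfs the Python call overhead for any op big enough to matter):

[code]
# Rough sketch: per-op Python overhead vs. actual compute time.
import time
import torch

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

start = time.time()
for _ in range(10):
    c = a @ b          # dispatches straight into the C/C++ (or CUDA) matmul kernel
elapsed = (time.time() - start) / 10

print("%.1f ms per matmul" % (elapsed * 1000))
# The Python wrapper adds microseconds per call; the matmul itself takes
# milliseconds to hundreds of milliseconds, so the wrapper is noise.
[/code]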

Extremely flexible, i.e. very easy to extend with custom layers and do little hacks to test a theory (see the sketch after this list).
Extremely fast. RNNs in TF can be as much as 10x slower on GPU than in PyTorch.
Very clean design, making it easy to modify the internals when needed.
Generally decently tested/bug-free, built with the same tooling the users use. Compare with TF, which is developed and tested against Google's internal tools; that caused huge issues at launch since the official nvcc and the in-house Google one don't behave the same at all.
No Keras nonsense. Keras is a buggy piece of crap and François Chollet (its dev) is Poettering-tier in dealing with bugs and issues. It doesn't even do input validation right, so if e.g. your input is the wrong size somewhere, you have to chase down an assert in the innards of the Keras code to figure out what actually happened, which usually means reading the whole code in each section for 10 levels around the point of assertion.
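To give an idea of the "easy to extend" point above, a minimal sketch of a custom layer (the layer itself and all the sizes are made up for the example):

[code]
# Minimal sketch of a custom PyTorch layer -- a linear map with a learned gate.
import torch
import torch.nn as nn

class GatedLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.gate = nn.Linear(in_features, out_features)

    def forward(self, x):
        # Plain Python, so it's trivial to drop in prints/asserts while testing a theory.
        return self.linear(x) * torch.sigmoid(self.gate(x))

layer = GatedLinear(128, 64)
out = layer(torch.randn(32, 128))   # (batch, in) -> (32, 64)
[/code]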

Caffe, but it's a mess. It works decently for CNNs, but everything else is a hack, and it has almost no users because of that.
Torch is Lua at its base; PyTorch takes Torch's C backend and rewrites the higher levels (originally Lua) in Python.
There's also Weka, but it's a big joke.

Speed does in fact matter, especially in production, and python is a huge bottleneck in practice for that. Not to mention the GIL. But researchers absolutely do not want to hunt down fucking memory bugs for days when they're trying to check if an idea could work.

The real problem with python is lack of typo checks/type checks/etc., because you end up running your model for 5 hours and then it crashes because you typo'd your model's name when saving, so you lose all your work.
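One low-tech way to guard against exactly that (just a habit, nothing framework-specific; the path here is hypothetical): do a dry-run write of the checkpoint path before training starts, so a typo crashes in second one instead of hour five.

[code]
# Sketch of a fail-fast preamble: verify the save path before burning GPU hours.
import os
import torch
import torch.nn as nn

save_path = "checkpoints/my_model.pt"    # hypothetical path for the example

os.makedirs(os.path.dirname(save_path), exist_ok=True)
torch.save({}, save_path)                # dry-run write; raises immediately if the path is bad
os.remove(save_path)

model = nn.Linear(10, 2)
# ... the actual 5-hour training loop would go here ...
torch.save(model.state_dict(), save_path)
[/code]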

I wrote a symbolic computation library in rust (basically like theano) for fun but arrayfire is kinda shitty so I gave up on it. Maybe someone else can figure out a good language and take over. Rust is too much trouble for researchers for sure though.

Also, a huge advantage of PyTorch is that it can actually do proper dynamic computation. TF has hacks that try to do that, but they don't really work out. Then again, I haven't really worked on applications where that was very useful.
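For anyone wondering what "dynamic" buys you: the graph is just whatever Python executes on that forward pass, so data-dependent control flow is ordinary code. A toy sketch (the module and its stopping rule are made up):

[code]
# Toy sketch of data-dependent control flow in a PyTorch forward pass.
import torch
import torch.nn as nn

class IterativeRefine(nn.Module):
    def __init__(self, size):
        super().__init__()
        self.step = nn.Linear(size, size)

    def forward(self, x):
        # Number of iterations depends on the values in x; a static graph
        # would need special control-flow ops (cf. tf.while_loop) for this.
        steps = 0
        while float(x.norm()) > 1.0 and steps < 10:
            x = 0.5 * torch.tanh(self.step(x))
            steps += 1
        return x

m = IterativeRefine(16)
y = m(torch.randn(2, 16) * 5)
[/code]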

Torch is available for android devices as well: github.com/soumith/torch-android

Read the Deep Learning book by Bengio et al. If you want to try things out on real-ish problems, look for Kaggle competitions. Even the ones that are over still provide you with data to train and test on. They also often let you submit your model past the end of the competition for evaluation "for fun".

i've used tensorflow and thought it was terrible. i managed to get the 2 neural networks i wanted to work, but god it's fucking shit design. i want to try pytorch but it works only on linux right now right? i can use linux but it's gonna break my workflow when i want to do other things

>bengio et al

i.e. You might as well give up now if you're not already three years into your graduate studies in deep learning.

You can try with docker (there are several ready-made torch+gpu configs I think) or a vm with gpu passthrough. Otherwise you can get an aws or azure gpu instance. Those are your best bet. You could always go with theano + lasagne but theano development has halted recently.

99% of the time you're gonna be waiting for the gpu to compute shit
the language you use to describe the layers and the training parameters doesn't matter much

/ourDLkawaiiFramework/
chainer.org

In some applications like RL, most of the workload does in fact happen on the CPU, so this is no longer true (except deep RL, which also uses DL and thus the GPU for a good chunk of the workload, though even then both devices may burden the system), so it's short-sighted to say that desu.
It also doesn't take into account what happens after you're done training and want to actually do something with the model: before TF, most companies implemented their own custom in-house C or C++ runtime for that purpose (it loads the weights of the model with the correct topology and can fprop, but that's it).
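To make the "fprop-only runtime" idea concrete, here's a toy sketch in plain numpy (the real things were in-house C/C++; the two-layer topology and the .npz weight layout are made up for the example):

[code]
# Toy sketch of a load-weights-and-forward-prop-only inference runtime.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def load_model(path):
    # Assumed export format: an .npz with W1/b1/W2/b2 arrays.
    w = np.load(path)
    return w["W1"], w["b1"], w["W2"], w["b2"]

def fprop(params, x):
    W1, b1, W2, b2 = params
    h = relu(x @ W1 + b1)
    return h @ W2 + b2          # no backward pass, no optimizer, nothing else

# params = load_model("model_weights.npz")   # hypothetical exported weights
# scores = fprop(params, batch_of_inputs)
[/code]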

I've never seen anyone besides the Japanese even talk about Chainer. It's probably very shitty. Not that I've ever had a chance to try it. Can someone redbull me on Chainer?

I'll be running CNNs on an FPGA by the end of the semester, hopefully. This is gonna be a fun project

Do I need to get a graduate degree if I want to get a non-meme job doing ML/AI?

I'd say I'm familiar with the principles behind most popular algorithms and architectures, I manage to get a basic understanding (enough to implement) of most of the academic papers I encounter, and I've worked with TF/Keras/Sklearn on a few practical projects.

Yet I still know I'm not good enough to come up with some really creative solution to certain problems or to fully understand why screwing with a specific hyperparameter influences the results the way it does, because I don't have enough of the math background.

Basically I just have a decent hands-on command of deep learning, but I'm weak on math/theory...

I'm really interested in the field though, so I'm wondering if it'd be worth pursuing a degree in it (I'm not that young either, 30 y/o)

How well can you code?
Where I work, there is a "data science" group that comes up with all the high-level algorithms and prototypes them in R and Python, and an "engineering" group that implements and optimizes them (sometimes in C++ or Scala).
The data science group is an MS in math/stats/CS at a minimum. The engineering group is more CS majors, but they also do a lot of non-mathy stuff like fucking around with databases and sysadmin-type stuff to keep everything running.

I can code pretty well in R, Python, Java and C.
I did some babby stats in college. I was a business/economics major and got an MS, but I had no interest in nor talent for business, so I started working as a dev instead.

From what you describe, I think I would be more interested in the data science part of the work. Can you give some more details on what they do?