/mlg/ - Machine Learning General

Why isn't this a thing?
any anons want to get this started?

Attached: Kernel_trick_idea.png (1260x540, 271K)


>Sup Forums - Consumerism

not actually very useful unless you work with large data sets.
Also all the good solutions exist and are developed by people much more capable.

I do a lot of ML type work. The field has been ruined by business analysts running regressions and then putting "I build artificial intelligence" in their linkedins.

>t. ML brainlet

Because there's maybe 5 people that frequent Sup Forums who are actually active in ML research, and 30 undergrads who finished their intro ML course.
You need people knowledgeable enough to drive a conversation, so why not talk about some of your work, OP?

honestly most of Sup Forums aren't very proficient in math and insecurely fling >buzzword whenever they see ML. I've had more success on /sci/ than on Sup Forums

How useful are k-nearest neighbor algorithms compared to linear regression?

Depends entirely on the dataset and problem.

k-nearest neighbors can allow you to solve non-linear problems and is generally more useful given a nice vector space.

nearest neighbors allows you to predict a sample based on some defined distance (or inner product) on the vector space.

linear regression allows you to solve an overdetermined system, i.e. Ax=b. linear regression is simply the projection of b onto im(A), which minimizes ||b - Ax|| and thus fits the line.
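The projection claim can be checked by hand: least squares solves the normal equations (AᵀA)x = Aᵀb. A minimal pure-Python sketch for a line fit, with made-up data points:

```python
# Fit y = m*x + c by least squares: solve the 2x2 normal equations
# (A^T A)[m, c]^T = A^T b, where A has rows [x_i, 1].
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # exactly y = 2x + 1, so the fit should recover that

n = len(xs)
sxx = sum(x * x for x in xs)
sx = sum(xs)
sxy = sum(x * y for x, y in zip(xs, ys))
sy = sum(ys)

# Solve [[sxx, sx], [sx, n]] @ [m, c] = [sxy, sy] by Cramer's rule.
det = sxx * n - sx * sx
m = (sxy * n - sx * sy) / det
c = (sxx * sy - sx * sxy) / det
print(m, c)  # → 2.0 1.0
```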

They have different applications. k-nearest neighbors is good for movie recommendations while linear regression is good for brainlet business marketing problems or basic science problems.

What you learned: try both and see which works.
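Trying both on a toy problem makes the difference concrete. A pure-Python sketch (dataset invented: y = x², which a line can't follow but a nearest neighbour can):

```python
# Toy comparison: 1-nearest-neighbour vs a least-squares line on
# a clearly non-linear target (y = x^2). Data invented for illustration.
train = [(x, x * x) for x in range(-5, 6)]  # (x, y) pairs

def knn_predict(q, k=1):
    # Predict by averaging the y of the k training points closest in x.
    nearest = sorted(train, key=lambda p: abs(p[0] - q))[:k]
    return sum(y for _, y in nearest) / k

def linfit_predict(q):
    # Least-squares line through the same training data (normal equations).
    n = len(train)
    sx = sum(x for x, _ in train)
    sy = sum(y for _, y in train)
    sxx = sum(x * x for x, _ in train)
    sxy = sum(x * y for x, y in train)
    det = n * sxx - sx * sx
    m = (n * sxy - sx * sy) / det
    c = (sxx * sy - sx * sxy) / det
    return m * q + c

print(knn_predict(4.2))     # → 16.0  (grabs the neighbour at x=4; true value 17.64)
print(linfit_predict(4.2))  # → 10.0  (the best line through a parabola is useless here)
```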

Thanks for your explanation, which was better than the brief coverage my professor gave during his lecture lol

I fell into a machine-learning gig at my job about a year ago after showing some initiative at learning it on my own to solve a problem the company was having. I'm a total amateur, but I was able to get the project off the ground by watching a lot of lectures and reading a lot of documentation.

Now it's kinda my job and I'm totally not qualified for it, and I'm feeling the Imposter Syndrome hard. Right now I'm working on migrating our learning model from Scikit-Learn to TensorFlow, and TensorFlow is proving to be difficult for me to grasp.

That said, ML is a fascinating field. Just dunno if I can do it forever; it's really, really dry.

Attached: im just a cave man.jpg (617x480, 56K)

I started that coursera course but got caught in finals after 3 weeks or so and just forgot about it. Time to redo it I guess.
I still can't decide whether to try and make a career out of this, or pick up infosec and make this a fun hobby

>TensorFlow is proving to be difficult for me to grasp.
What's the hard bit for you? There were a few things I found challenging:

1) Realising that whenever the documentation talks about a "graph", it's really just talking about a mathematical expression implemented in Tensorflow.
2a) Realising that, in Tensorflow, tensors were legitimately the building blocks of the expressions and not only a buzzword.
2b) Tensorflow is roughly numpy plus the ability to minimise expressions using gradient descent. So I could often google questions as if they were to do with numpy instead of TF.
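Point 2b can be shown without any framework at all: here is the "minimise an expression with gradient descent" loop written by hand in plain Python. What TF adds on top is computing the gradient for you on arbitrary expressions; this sketch hard-codes it:

```python
# Minimise f(w) = (w - 3)^2 by repeatedly stepping against the gradient.
# TF's value-add is differentiating arbitrary graphs automatically;
# here the derivative is written out by hand.
w = 0.0
lr = 0.1
for _ in range(100):
    grad = 2.0 * (w - 3.0)  # d/dw of (w - 3)^2
    w -= lr * grad
print(w)  # converges to 3.0
```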

I'd have felt like an even bigger retard if I'd started learning TF without first-year knowledge of calculus.

I feel like I get the concepts decently enough, but implementing it for our specific needs has been difficult for me.

We're using ML to classify wordy data (3-4 paragraphs of text per sample), and getting the data into the format that works with TF has been a bit difficult for me for some reason.

In Scikit-learn I used a TFIDF Vectorizer to vectorize our wordy data and then fed that high-dimensional sparse matrix into a SVM classifier.
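For anyone who hasn't met TF-IDF: a stripped-down pure-Python version of what a TFIDF vectorizer does, using one common idf variant. Real implementations (e.g. scikit-learn's) add smoothing and normalisation; the corpus here is invented:

```python
import math
from collections import Counter

# Minimal TF-IDF: each document becomes a vector over the vocabulary,
# weighting each term by (count in doc) * log(n_docs / docs containing term).
docs = ["the cat sat", "the dog sat", "the dog barked"]
tokenised = [d.split() for d in docs]
vocab = sorted({w for doc in tokenised for w in doc})

def tfidf(doc):
    tf = Counter(doc)
    n = len(tokenised)
    vec = []
    for w in vocab:
        df = sum(1 for d in tokenised if w in d)  # document frequency
        idf = math.log(n / df)
        vec.append(tf[w] * idf)
    return vec

print(vocab)
print(tfidf(tokenised[0]))  # "the" appears everywhere, so its weight is 0
```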

In TF we're trying to move to a Word2Vec vectorization and a Sequence-to-Sequence model, but the implementation of these two has been difficult for me to grasp, I guess.
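One concrete piece of the word2vec data-format problem is generating skip-gram training pairs from raw text, which happens before any TF code. A minimal sketch (sentence and window size made up):

```python
# Skip-gram data prep: turn a sentence into (centre, context) pairs,
# taking context words within `window` positions of each centre word.
sentence = "the quick brown fox jumps".split()
window = 2

pairs = []
for i, centre in enumerate(sentence):
    lo = max(0, i - window)
    hi = min(len(sentence), i + window + 1)
    for j in range(lo, hi):
        if j != i:
            pairs.append((centre, sentence[j]))

print(pairs[:4])
# → [('the', 'quick'), ('the', 'brown'), ('quick', 'the'), ('quick', 'brown')]
```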

I've been working through some classes online to try to learn how to implement Word2Vec and Seq2Seq in TF, but I guess I've plateaued a bit in learning it so I feel stagnated.

any recommended books if my aim is to make a neural network that learns to play tetris?

I have a master's in ml/ai and worked on ml for nlp for a well known company.

It's an interesting field, but you very quickly get diminishing returns and spend months on evaluation and analysis for a 1% relative improvement.

I'd rather be programming, so now I'm working on compilers.

Most people doing ML related stuff fall into three categories:
>the programmer who has no idea how any of it actually works besides the most basic stuff like calculating precision and maybe linear regression, but can piece together ready implementations to make something that kinda works
>the mathematician/business person who would probably fail writing fizzbuzz and for some reason uses GINI instead of AUC
>the absolute beginner who is very excited and still thinks machine learning is mostly about neural networks, also doesn't see the field for the very thinly veiled statistics that it is
It's very rare to meet someone who is a competent programmer and also understands how SVM or gradient boosting works. Why?
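On the GINI jab: for binary classifiers the Gini coefficient quoted in industry is just 2·AUC − 1, which the pairwise definition of AUC makes obvious. A pure-Python sketch with invented scores and labels:

```python
# AUC = P(score of a random positive > score of a random negative),
# counting ties as half a win; Gini = 2*AUC - 1. Toy data for illustration.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]

pos = [s for s, y in zip(scores, labels) if y == 1]
neg = [s for s, y in zip(scores, labels) if y == 0]

wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
auc = wins / (len(pos) * len(neg))
gini = 2 * auc - 1
print(auc, gini)  # → 0.888... 0.777...
```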

I dropped out of CompSci postgrad in AI back in 2010 because it was a bunch of pointless optimisation algorithms and techniques without any real world use.

Fast forward 2 years and everyone is on about this newfangled "Machine Learning" and "Data Science/Analytics". WTF. Look it up. It's just stats, not deep learning or AI.

>3-4 paragraphs
>Seq2Seq
I thought that only worked on short- to medium-length sequences. Maybe a bag of words model would be best, depending on what you're doing. What are you doing anyway?

I should mention I'm just a hobbyist who stopped being interested in natural language processing after early 2015, so you should take anything I say with a grain of salt.

>It's very rare to meet someone who is a competent programmer and also understands how SVM or gradient boosting works. Why?
Because they're not magic, like deep NNs. Either you have to spend time hand-crafting a kernel for your SVM, or your computer has to spend time on a horribly inefficient gradient boosted model. Nothing fun or sexy about that.
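For anyone wondering what "hand-crafting a kernel" even starts from: the usual default is the RBF kernel k(x, y) = exp(−γ‖x − y‖²), and the manual labour is tuning γ or replacing it with a domain-specific kernel. A minimal sketch with made-up points:

```python
import math

# RBF kernel: similarity 1.0 for identical points, decaying with
# squared distance at a rate set by gamma. Points invented for illustration.
def rbf(x, y, gamma=0.5):
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

a, b = (0.0, 0.0), (1.0, 1.0)
print(rbf(a, a))  # → 1.0
print(rbf(a, b))  # exp(-1), roughly 0.368
```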

>everyone is about this newfangled "Machine Learning" and "Data Science/Analytics"
jewish tricks, just like the one that originally started this whole IT craze. can't say I'm surprised.

Basically we're wanting to classify short descriptions into categories.

e.g.: this short description is likely category A; this other description is likely category B, etc., but with a few layers of classification.

That is, we have ~7 top-level categories, and each of those categories has a handful of sub-categories under it. So something like ~35 total categories, with a multi-level classification model to increase the chance of a correct prediction.
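That multi-level scheme can be sketched as a router: a top-level model picks the category, then a per-category model picks the subcategory. The keyword rules below are placeholders for the real trained models (categories and texts invented):

```python
# Hierarchical routing: top-level model first, then a subcategory model
# chosen by that prediction. The keyword rules stand in for whatever
# trained classifiers (SVM, NN, ...) fill these slots in practice.
def top_level(text):
    return "hardware" if "disk" in text or "fan" in text else "software"

sub_models = {
    "hardware": lambda t: "storage" if "disk" in t else "cooling",
    "software": lambda t: "install" if "install" in t else "crash",
}

def classify(text):
    top = top_level(text)
    return top, sub_models[top](text)

print(classify("the disk is making noises"))  # → ('hardware', 'storage')
print(classify("app crashes on startup"))     # → ('software', 'crash')
```

One design note: errors compound down the hierarchy, so the top-level model's accuracy caps the whole pipeline's.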

Our current SVM model works decently well, but we don't see it working on an enterprise level, which is the end goal. We're hoping to see increased prediction accuracy with neural network models and word2vec in TensorFlow.

Again I'm a total amateur who kinda fell into this gig so I'm no expert. There's a lot of stuff I'm fuzzy on just by virtue of having crammed a lot of this info over the past ~8 months.

The programmer type really bugs me. Probably because I come from the mathematics end. At least I know my code is pretty much garbage in general, but I'm only paid to make the models and get simple implementations going.

I'll be graduating in an ML topic in about a month. We classify images and sensor data gathered in vehicles for environment detection purposes.

One huge problem in ML right now is that so many researchers don't actually analyse their results. It's just gathering data, training algorithm X, and showing >90% accuracy somewhere. No better metrics, no other information, no feature analysis or whatever. There was a paper recently that classified 'criminals' by their passport photos, and (iirc) all they did for verification was a chi-squared test.
Some papers literally read like
> "The algorithms do their magic things and stuff to achieve 90% performance"
Yea no wonder some people say it's a fad, or that ML/AI is 'scary'.

Also it's like Blockchain: the term alone apparently draws interest, so companies tend to use it for no reason at all.

This user is right. CNNs get used for the new hip stuff, and the companies that use NNs but won't divulge their 'secret proprietary NN algorithm' are usually running huge hacks to achieve their presented performance numbers.
Almost every open industry-standard library/package uses the 'old' algorithms: SVM, gradient boosting, logistic regression, or even C4.5.
Many deployed algorithms are even less 'sophisticated' but simpler to use; almost nobody really uses CNNs for face classification, they just apply Haar cascades, eigenfaces, etc.

ieeexplore.ieee.org/document/1641014/
This paper gives a pretty nice overview of pros/cons of kNN.
Also the algorithm they propose is hella fast