2017

> 2017
> not using R

Why aren't you working as a data scientist, user?

python is getting better tools

why are you using a meme language, user? You surely know Microsoft has development tools for R

because R is a meme language

python is better for scripting and things like stan are better for bayesian statistics

interactive visualisation is also becoming more important and things like bokeh make it easy with python

this was developed at my university (UoA)

I just like the scipy stack. It's what I used first

I also used Java first, but I learned from my mistakes, user :^)

R is a meme language that doesn't even have access to TensorFlow. How are you going to train machine learning models on distributed machines with R?

there are R interfaces to stan

tensorflow.rstudio.com/

Totally agree.

Bu... But I am, user.
Data Science Engineer in fact.

oh... Well... Then R can do the same in Data Science as Python can... I can't think of anything else to complain about.

the only reason to choose between R or Python is imo which libraries you might need or which you're more familiar with

i prefer R for data exploration/analysis over Python because I find R to have better REPLs/IDEs for data analysis and I like how R has lots of convenience functions and syntactic sugar making data exploration very easy and straightforward

> R and Python faggots
> No SAS master race

you're disappointing Sup Forums

SAS a shit and can't compete with R on its best days

quit that SAS bullshit, the real thing is Tableau, much better data visualization

The SAS gods are smiling at me, impe"R"ials, can you say the same?

I prefer R for Data wrangling and Python for everything else. But mostly because I am not as used to Pandas. But data scientists should know both.

R has the most comprehensive collection of machine learning and statistical analysis packages, what the fuck are you talking about user?

no

I'm a graduate researcher in biology and I used R a lot in undergrad but stepped up to python.

I rarely touch R now, only when a tool someone else wrote is only in R.

R is a memory hog. Yeah you can optimize it with fancy packages. But who has the time for that?

With Python you have numpy, a lot of which is written in C. There are also Cython, PyPy, and others that can really speed up big data projects.

Python (SciPy, Numpy, Pandas, Scikit-Learn, TensorFlow) >>>>> R

Python is Chad, R is cuck
I had to use SAS for my undergraduate stats class. Boy it sucks and is proprietary

you're mistaking Python for a dick, stop sucking it faggot

>Python you have numpy which a lot of it is in C.
most of R is implemented under the hood as C functions. if your R code was slow then you were fucking up somewhere

>ctrl+f julia
>0 results
You're missing out.

also a grad student here, but in CS. it's partly about the efficiency but honestly a big part of why nobody i know uses R is that it's just a huge bitch to get good at compared to python, julia, etc...

if you're going to deal with a memory-intensive shitshow, you might as well deal with a language that's reasonably readable like python.

>data """scientist"""

>R code was slow then you were fucking up somewhere
I didn't use dplyr or whatever it's called.

For big data projects I find that R will consume 12GB of RAM vs 4GB in python, e.g. when loading a 4GB file into a data frame.
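This doesn't explain the 12GB vs 4GB numbers above. It's only a rough sketch of how the same values can take several times more RAM depending on how they're stored in memory. It assumes CPython (`sys.getsizeof` behaves differently elsewhere), and the element count is made up for illustration.

```python
import sys
from array import array

n = 100_000          # hypothetical number of values
raw_bytes = n * 8    # the same values written out as raw 8-byte doubles

# A list of Python floats: every element is a separate object,
# and the list itself stores a pointer to each one.
boxed = [float(i) for i in range(n)]
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)

# array('d') keeps the same values as packed 8-byte doubles.
packed = array("d", boxed)
packed_bytes = sys.getsizeof(packed)

print("raw:", raw_bytes, "boxed list:", boxed_bytes, "packed array:", packed_bytes)
```

The gap between the boxed and packed figures is the kind of overhead that decides whether a file fits in RAM once loaded, whatever language you load it with.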

>data """science""" (((engineer)))

call us when julia is stable

> is that it's just a huge bitch to get good at
i've never understood this. sure, some of the basic syntax is antiquated compared to Python but most of the concepts are nearly direct translations between the two languages. the only thing that Python has that R doesn't is list comprehensions.
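For anyone who hasn't seen one, a minimal sketch of a list comprehension next to the equivalent explicit loop. The data and variable names are made up for illustration; the R line in the comment is the usual vectorized idiom, not something from this thread.

```python
# Hypothetical sample data, made up for illustration.
readings = [3.2, 7.8, 1.5, 9.1, 4.4]

# Explicit loop: keep readings above a threshold and square them.
squared_loop = []
for x in readings:
    if x > 4:
        squared_loop.append(x ** 2)

# The same thing as a list comprehension.
squared_comp = [x ** 2 for x in readings if x > 4]

# In R the closest idiom is vectorized indexing, roughly:
#   readings[readings > 4]^2

# Both approaches build the same list.
print(squared_comp == squared_loop)
```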

you don't need to use dplyr, you just need to properly leverage vectorized functions.

>grad student here, but in CS
What's a typical thesis for a CS grad student? I'm in bioinformatics (genetics).

R has Bioconductor, which has a lot of libraries for microarrays (the platform 23andMe uses for genotyping), which is pretty cool.

>you don't need to use dplyr, you just need to properly leverage vectorized functions.
>time to learn to optimize R >>> time to learn to optimize python
R is cucked.

no I won't.

R does crazy shit like make copies of objects really readily. it's reasonably fast at operations but the design choices its creators made in implementing the language aren't all that suitable for the use cases of a lot of people.

i remember reading about why they decided to go this route and thinking it was kind of reasonable, but i've forgotten the rationale. R is a great example of how a language that's well-designed for a certain community and/or use case can be pretty bad for other communities/use cases.

which shouldn't be a revelation, but it is.
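As far as I understand it, the rationale is value semantics: a function that modifies its argument works on its own copy, so it can't change the caller's data behind its back. A rough Python sketch of the trade-off, with hypothetical function names. It only mimics the behaviour and says nothing about how R implements it internally.

```python
import copy

def add_offset_in_place(values, offset):
    """Mutates the caller's list directly; no copy is made."""
    for i in range(len(values)):
        values[i] += offset
    return values

def add_offset_on_copy(values, offset):
    """Copies first, so the caller's list is left untouched (R-like behaviour)."""
    local = copy.copy(values)
    for i in range(len(local)):
        local[i] += offset
    return local

data = [1, 2, 3]

# Copying version: safe for the caller, but costs an extra list in memory.
result = add_offset_on_copy(data, 10)
print(data, result)

# In-place version: no extra memory, but the caller's data changes.
in_place = add_offset_in_place(data, 10)
print(data, in_place is data)
```

The copying version is easier to reason about in an interactive analysis session; the in-place version is what you want when the data barely fits in RAM.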

you're overestimating the time required to learn not to write shitty code in R

1. check and see if the function accepts vector inputs
2. if it does, supply the whole vector at once

>implying you can't be a data scientist working in any language

>but the design choices its creators made in implementing the language aren't all that suitable for the use cases of a lot of people.
oh sure, i absolutely would never recommend R as a general use programming language. some of the stuff it does is batshit insane.

but we're talking data analysis here - that's R's bread and butter, the exact use case it was designed around

>2. if it does, supply the whole vector at once
see this
>R does crazy shit like make copies of objects really readily

this is why it's not a good idea.

>data """science""" (((engineer)))

What do they even do? I see these fuckers in the office, not friendly nor social, and I don't see them providing any worth to the company. They sit around and talk about some retarded games in the gaming slack channel instead of actually doing any work.

if you supply the whole vector at once, it sends the whole vector to a single function call. if you iterate over the vector, you have R's molasses overhead on every single separate function call

it might be shittier on memory, but it should still be more time efficient
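The same effect shows up in Python, so here's a sketch of it using only the standard library: doing the per-element work in interpreted code versus handing the whole sequence to a single built-in call (in CPython, the loop inside `sum()` runs in C). The function names and data size are made up for illustration, and the timings will vary by machine.

```python
import timeit

# Hypothetical data, made up for illustration.
values = [float(i) for i in range(100_000)]

def total_per_element(xs):
    """Adds the elements one at a time in an interpreted loop."""
    total = 0.0
    for x in xs:
        total += x
    return total

def total_whole_sequence(xs):
    """Hands the whole sequence to one built-in call."""
    return sum(xs)

# Same answer either way; only the per-element overhead differs.
print(total_per_element(values) == total_whole_sequence(values))

loop_time = timeit.timeit(lambda: total_per_element(values), number=20)
builtin_time = timeit.timeit(lambda: total_whole_sequence(values), number=20)
print(f"loop: {loop_time:.4f}s  single call: {builtin_time:.4f}s")
```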

I use R for plotting stuff, but it is not a programming language.

yes it is. you may not like the language, but it's still a programming language

R doesn't give great error messages (read: exception messages are cryptic), the help documents are relatively inaccessible to beginners (compare to PHP, Python, or Javascript documentation as some beginner-friendly examples), and lots of commands and variables are not intuitively named, especially if you come from other languages.

do you mean in content, or length? the content of a CS thesis varies a lot. our department is broken into 3 areas — systems, theory, and applications — and within those areas you have groups (in applications we have Graphics, AI, NLP, Data Management & Mining, HCI, and Computational Biology).

in the most general sense, a typical CS student's thesis here will more or less be a synthesis of the papers we've published at conferences (conferences being the primary venue for us, rather than journals).

Okay, I can see how those would be problems.

yeah i'm trying to remember the reason R does these full copies because i vaguely remember it seeming like a reasonable explanation for mathematicians or something, but in practice for people dealing with lots of data (i do social computing research so by definition we're almost always dealing with tons of quantitative data), if you use R you draw as small a border around it as possible and only pipe in exactly the data you want it to deal with.

but with numpy, pandas, etc... coming around, R's territory keeps shrinking and shrinking.

It has memory leaks if you make a for loop.
It cannot be used to make large programs.
If you make something that takes more than 3 seconds to run, you are using the wrong tool.

none of that makes it not a programming language

i mean all of that being said, R is still probably worth learning. every language has warts, but R has a clear, common use case, which is more than you can say for a lot of languages that are clumsy to learn. it's like LaTeX — yes, it sucks to learn it, but the alternatives aren't really ready yet.

although a friend of mine literally wrote a paper in Adobe InDesign out of frustration with LaTeX's figure placements, and now that he's satisfied with his template for the ACM conferences he submits to, i think he's fucking done with LaTeX. and python and javascript have nice data visualization libraries (and python has good data analysis libraries), so maybe they're both losing ground

Because I work as a QA Support Analyst.

They're probably people that get hired because they have autism.

cool thanks for the insight :)

what year are you? I'm starting my 5th...

Okay, I stand corrected.
It is a language that cannot be used to make real™ programs.

Real developers use COBOL for data analysis.

Liblinear > glmnet

>Caring about Microsoft
Son, I am disappoint

>Not using \FloatBarrier if you want to micromanage figure placement

but i am, my stat class's labs are all in R. our local industries requested it, and my college is smart and listens to local industry about what it wants from comp sci grads.

This is a good point, but it is important to understand that a data scientist's workflow is not always about making programs. I find R is way easier for one-off analyses. If I need to put something into production I'll use Python. Either way, serious computation needs to be written in C/C++/Fortran.

for a stat class you shouldn't really have even needed industry to ask for it to be in R

> I only use non-proprietary software.

Don't worry user it's just a phase.

yes. That is what I use it for as well.

>I use Windows
>Ever

I wish I had a witty excuse for this autism. But I don't, there's no excuse. You are a failure.

R is an academic circlejerk language for people without any prior programming knowledge

people without prior programming knowledge that have a higher paycheck than you coding monkey :^)

>call us when julia is stable
Not knowing how to handle your waifu? Are you Sup Forumsermin?

I'm an actual meme "data scientist" and this is correct.

Everywhere I've worked, data science teams are basically split into two: math/statsfags who prototype algorithms in Python, R, or Matlab; and CSfags who take those algorithms and optimize them in C/C++ or even Fortran (and Scala is becoming popular).

It doesn't surprise me that Sup Forums would hate R, because R is basically made by and for statisticians.