Are these the seeds to be nurtured to bring in automation, innovation and transformation? There is a saying: necessity is the mother of invention. I would say innovation is the amalgamation of creativity and necessity. We need to understand the ecosystem in order to apply creativity and identify the ideas that bring in change. We need to keep pace with the changing ecosystem and think beyond the possible. What is the biggest challenge in doing this? "Unlearning and learning": we tend to think the current ecosystem is the best. Be it the health, financial services, agriculture or mechanical domain, we need to empathize with the stakeholders to come up with a strategy to drive change. A very evident example is that the quality of life is changing every millisecond. A few decades back, a phone connection was limited to a few, but today almost every millennial has a mobile phone. The phone is no longer just a medium to talk; it is such a powerful device that innovative solutions can be developed on it.
Hello Data Scientists,
Let me continue from my last blog, “What skills one need to be a Successful Data Scientist?” (http://outstandingoutlier.blogspot.in/2017/08/what-skills-one-need-to-be-successful.html), where I wrote about the importance of a Data Scientist being an all-rounder: analytical skills, an understanding of mathematics and statistical analysis, the use of power tools, domain expertise and, last but not least, presentation and visualization skills. Taking it forward, in this blog I will delve into why I, as a Data Scientist, prefer R over other tools. Choosing R does not demean the power of other tools like SAS, Python etc. Every tool has its own capabilities and limitations.
“Excellence is never an accident. It is always the result of high intention, sincere effort, and intelligent execution; it represents the wise choice of many alternatives - choice, not chance, determines your destiny.” ― Aristotle
To be proficient, one should know both the theory and the practical aspects of a topic. To be productive, one should be proficient, but the right choice of tool makes one smart. A while ago, when I was exploring which programming tool to use, I tried performing an objective comparison. Before I get into the pros and cons, let me make it very clear that this was my personal choice, and I would suggest you evaluate the options against your own needs. It is in no way an easy choice to pick one language by default for data analysis. This blog will help you objectively decide which language to pick based on different parameters.
| Tool Attributes | Python | R | SAS | SPSS |
|---|---|---|---|---|
| License | Freeware | Freeware | License | License |
| Install | Easy to install | Easy to install | Needs a server and detailed requirements | Needs detailed requirements |
| Easy to learn | Easy to understand syntax | There is a learning curve involved initially; later easy to use | Interactive UI and easy to learn | Good UI |
| Visualization | Good | Excellent graphical output | Excellent visualization | Good |
| Purpose | General purpose | Statistical purpose only | Data mining and statistical analysis | Extended later for statistical purpose |
| Usability | More lines of code to get output | High-level language; fewer lines of code to get output | High-level language like SQL | Pull-down menu driven and 4GL commands |
| Adopted by | Engineers and software professionals | Researchers & consultants | Enterprise-level software | Enterprise-level software |
| Architecture | Consumes memory | Well managed on servers | Well managed | |
| Graphical interface | Editor driven | R editor by default, but RStudio is an easy-to-use UI | Interactive and accessible UI | Good built-in interactive UI |
| Support | Very strong open source community | Managed very well because of usage | Decent support because of restricted preference | |
It is very clear that no single tool is a clear winner. If there is a huge volume of data (in TB), we should go for SAS, whereas if we want a tool that installs on a laptop and still has the full power to perform all statistical analysis, we should go for R.
However, let me now list my top 10 reasons to go for R:
1. Open source software.
2. Easy to install across platforms.
3. Runs on standalone computers and individual servers.
4. Extensive library of statistical packages.
5. Extraordinary data visualization.
6. RStudio is a big plus: an easy-to-use IDE.
7. Easy to integrate with other packages like Excel and SAS.
8. Easy to create scripts and pass them on to other stakeholders.
9. The trend for R is flying high; it is the "in" thing in the data and statistics category.
10. Higher average salary for R practitioners.
R is one of the statistical programming languages, and the one I chose to start with. However, as market dynamics change and we mature, I might also pick up Python, SAS, SPSS etc., based on need.
In my next blog, I will begin with “how to install R”. Though it is a matter of choice, I will pick R for Windows because I have a laptop running Windows.
Thank you once again for your patience and your precious time in reading through this article. I hope it has been insightful and has aided you in deciding which language to pick to be a successful Data Scientist. Kindly share your valuable opinions. Please do not forget to suggest what you would like to understand and hear from me in my future blogs.
Thank you,
Outstanding Outliers :: "AG"