Is today's world all about creativity and ideation?

Are they the seeds to be nurtured to bring in automation, innovation and transformation? There is a saying: necessity is the mother of invention. I would say innovation is an amalgamation of creativity and necessity. We need to understand the ecosystem in order to apply creativity and identify the ideas that bring change. We need to keep pace with a changing ecosystem and think beyond the possible. What is the biggest challenge in doing this? "Unlearning and learning": we tend to think the current ecosystem is the best. Be it health, financial services, agriculture or the mechanical domain, we need to empathize with the stakeholders to come up with a strategy to drive change. A very evident example is that the quality of life is changing every millisecond. A few decades back a phone connection was limited to a few, but today nearly every millennial has a mobile phone. The phone is no longer just a medium to talk; it is such a powerful device that innovative solutions can be developed on it.

Why Do Data Scientists Prefer R?

Hello Data Scientists,

Let me continue from my last blog http://outstandingoutlier.blogspot.in/2017/08/what-skills-one-need-to-be-successful.html :: “What skills one need to be a Successful Data Scientist?”, where I wrote about the importance of a Data Scientist being an all-rounder with analytical skills, an understanding of mathematics, statistical analysis, usage of power tools, domain expertise and, last but not least, presentation/visualization skills. Taking it forward, in this blog I will delve into why I, as a Data Scientist, prefer R over other tools. Choosing R does not demean the power of other tools like SAS, Python etc. Every tool has its own capabilities and limitations.

“Excellence is never an accident. It is always the result of high intention, sincere effort, and intelligent execution; it represents the wise choice of many alternatives - choice, not chance, determines your destiny.” ― Aristotle

To be proficient, one should know both the theory and the practical aspects of a topic. To be productive, one should be proficient, but the right choice of tool makes one smart. A while ago, when I was exploring which programming tool I should use, I tried to perform an objective comparison. Before I get into the pros and cons, let me make it very clear that this was my personal choice, and I would suggest you evaluate the tools against your own needs. It is in no way an easy choice to pick one language by default for data analysis. This blog will help you decide objectively which language to pick based on different parameters.

| Tool Attributes | Python | R | SAS | SPSS |
|---|---|---|---|---|
| License | Free (open source) | Free (open source) | Commercial license | Commercial license |
| Install | Easy to install | Easy to install | Needs a server and detailed requirements | Needs detailed requirements |
| Easy to learn | Easy to understand syntax | There is a learning curve initially; later easy to use | Interactive UI and easy to learn | Good UI |
| Visualization | Good | Excellent graphical output | Excellent visualization | Good |
| Purpose | General purpose | Statistical purpose only | Data mining and statistical analysis | Extended later for statistical purposes |
| Usability | More lines of code to get output | High-level language; fewer lines of code to get output | High-level language like SQL | Pull-down menu driven, with 4GL commands |
| Adopted by | Engineers and software professionals | Researchers and consultants | Enterprise-level software | Enterprise-level software |
| Architecture | | Consumes memory | Well managed on servers | Well managed |
| Graphical interface | Editor driven | R editor by default, but RStudio is an easy-to-use UI | Interactive and accessible UI | Good built-in interactive UI |
| Support | | Very strong open source community | Managed very well because of usage | Decent support because of restricted preference |

It is very clear that no single tool is the outright winner. If the data is huge (in terabytes), we should go for SAS; if we want a tool that installs on a laptop and still gives full power to perform all statistical analysis, we should go for R.
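To make the "decide objectively" idea a little more concrete, here is a minimal sketch in Python that encodes just two rows of the comparison table (License and Install) and filters the tools by what you need. The dictionary layout, the field names `free` and `easy_install`, and the helper `shortlist` are my own illustrative assumptions, not part of any library, and reducing each cell to yes/no is a simplification of the table.

```python
# Two rows of the comparison table, encoded as plain dictionaries.
# Field names ("free", "easy_install") are hypothetical, chosen for illustration.
TOOLS = {
    "Python": {"free": True, "easy_install": True},
    "R": {"free": True, "easy_install": True},
    "SAS": {"free": False, "easy_install": False},
    "SPSS": {"free": False, "easy_install": False},
}


def shortlist(tools, **required):
    """Return the names of tools whose attributes match every required value."""
    return [
        name
        for name, attrs in tools.items()
        if all(attrs.get(key) == value for key, value in required.items())
    ]


# Tools that are free and install easily on a laptop.
laptop_friendly = shortlist(TOOLS, free=True, easy_install=True)
print(laptop_friendly)
```

With these two criteria alone, both Python and R stay on the shortlist, so the remaining rows of the table (purpose, visualization, community support) are what would tip the choice one way or the other.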

However, let me now list my top 10 reasons to go for R:
1. Open source software.
2. Easy to install across platforms.
3. Runs on standalone computers as well as individual servers.
4. Extensive library of statistical packages.
5. Extraordinary data visualization.
6. RStudio is a big plus: an easy-to-use IDE.
7. Easy to integrate with other packages like Excel and SAS.
8. Easy to create scripts and pass them on to other stakeholders.
9. The trend for R is flying high; it is the "in" thing in the data and statistics category.
10. Higher average salary for R practitioners.


R is the statistical programming language I chose to start with; however, as market dynamics change and we mature, I might also pick up Python, SAS or SPSS based on need.
In my next blog, I will begin with “how to install R”. Though it is a matter of choice, I will pick R for Windows because I have a laptop with Windows OS.


Thank you once again for your patience and your precious time reading through this article. I hope it has been insightful and has aided you in deciding which language to pick to be a successful Data Scientist. Kindly share your valuable opinions. Please do not forget to suggest what you would like to understand and hear from me in my future blogs.

Thank you, 
Outstanding Outliers :: "AG"


