Links for "Provisioning" an AWS Windows Instance for Data Science
© Copyright 2015, Professor George S. Easton
The following links can be used to "provision" an Amazon Web Services Windows instance
with Python and R for doing data science.
Note: On a server, you should generally not browse the internet, as this creates the risk of
malware getting onto the server and then infecting all of the machines that the server serves.
For this reason, while Internet Explorer is already on the Windows 2012 server in AWS, its
security level is set to high, and things are set up so that it is impossible (or at least
very difficult) to change that setting.
Thus, we need to install another browser in order to easily provision our instance.
Firefox is very easy to install. Keep in mind, however, the caution above about browsing the
internet on a server.
ClamWin: Open Source Anti-Virus Scanning Software
After you click this link, WAIT! There are all kinds of "stuff" on the page that comes up
that look like they might be the download button or link. They are trying to get you to click
through and/or install other programs. So just WAIT.
Note: An AWS Windows instance does not include any anti-virus software by default.
There are many free anti-virus packages available, but, surprisingly, almost all of these
"detect" that the installation is on a Windows Server operating system and then the
anti-virus packages refuse to install.
Apparently, the Windows Server operating system is considered "professional" software
and it is thus not eligible for the free versions of the virus scanning software.
ClamWin is an exception. It is both free and open source.
It is the only free option that I have found that works on the
Windows Server operating system without some kind of hack, and thus
the only free option that I have found that one can easily use
with Windows instances in AWS.
One thing to note, however, is that ClamWin does not do real-time scanning (see below).
Clam Sentinel: Active Real-Time Scanning Using ClamWin
Once again, WAIT after you click this link.
Clam Sentinel adds real-time virus scanning to the ClamWin anti-virus software.
IMPORTANT! When installing the programs below (especially Anaconda Python), temporarily
turn off Clam Sentinel. To do this, go to your system tray, click on the shield-shaped
icon corresponding to Clam Sentinel, and click stop in the menu that appears.
Anaconda Python is free and includes many very useful modules already installed as
well as other useful tools and software (such as the Spyder IDE). I use Python version 2.7.
This is a stable "production" version of Python. I do not yet use Python 3.X as I have found
it to be unstable and many useful modules have not yet been ported to Python 3.
NOTE: You MUST turn off your virus scanning software during the installation of Anaconda Python!!!
ClamWin and Clam Sentinel WILL break the Anaconda Python installation!
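Once the installation finishes, it can be worth confirming that it was not damaged. The following is only a minimal sketch, not part of the original instructions: it tries to import a few modules and reports which ones load. The module list (numpy, scipy, pandas, matplotlib) is an assumption about what you care about; the function name check_modules is made up for this illustration. It is written to run under Python 2.7 as well as Python 3.

```python
import importlib
import sys

# Assumed list of modules to verify; edit it to match what you plan to use.
MODULES_TO_CHECK = ["numpy", "scipy", "pandas", "matplotlib"]


def check_modules(names):
    """Return a dict mapping each module name to True if it imports, else False."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # A missing or broken module shows up here as an ImportError.
            results[name] = False
    return results


if __name__ == "__main__":
    print("Python version: %s" % sys.version.split()[0])
    for name, ok in sorted(check_modules(MODULES_TO_CHECK).items()):
        print("%-12s %s" % (name, "OK" if ok else "MISSING"))
```

If any module is reported as MISSING, one plausible cause is that virus scanning was active during the installation, in which case reinstalling with scanning stopped is a reasonable first thing to try.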
The R Statistical Language
The R statistical language is free and open source, and it is probably the most widely used
software for research in statistical methods. It is a very flexible language with many
installable packages that allow you to compute almost any statistical method.
RStudio is a free IDE (integrated development environment) for R.
CYGWIN: Linux-like Tools for Windows
Note: The AWS instances that I use when I teach data science to my MBA students
are Windows instances. I made the decision to use Windows instances so that my students
do not have to learn another operating system on top of the many other things they
will need to learn about data science.
However, some Unix/Linux tools are very useful, especially for handling large data sets.
CYGWIN provides those tools in a Windows environment.
Note: CYGWIN is pronounced sig-win.
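After installing CYGWIN, you may want to check whether its tools can be found from an ordinary Windows command prompt. The sketch below is an illustration only: it assumes the CYGWIN bin directory has been added to the Windows PATH (this may require a manual step), and the tool names checked (grep, head, wc, sed) are simply common examples. The helper name find_on_path is made up for this illustration. It runs under Python 2.7 as well as Python 3.

```python
import os


def find_on_path(tool, search_path=None):
    """Return the full path of `tool` if it is in a PATH directory, else None.

    `search_path` defaults to the PATH environment variable. Both the bare
    name and the name with a ".exe" suffix are tried in each directory.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        for candidate in (tool, tool + ".exe"):
            full = os.path.join(directory, candidate)
            if os.path.isfile(full):
                return full
    return None


if __name__ == "__main__":
    # Example tool names only; change the list to whatever you rely on.
    for tool in ["grep", "head", "wc", "sed"]:
        location = find_on_path(tool)
        print("%-6s %s" % (tool, location or "not found on PATH"))
```

A result of "not found on PATH" does not necessarily mean the installation failed; the tools may still be available from the CYGWIN terminal itself.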
IMPORTANT! When you finish installing the software, turn Clam Sentinel back on.
It is critical that you have real-time virus scanning turned on at all times (except when
installing software). To turn it back on, go to your system tray, click on the shield-like icon
belonging to Clam Sentinel, and then click "start" in the menu that appears.
© Copyright 2015, 2016, Professor George S. Easton