Links for "Provisioning" an AWS Windows Instance for Data Science

© Copyright 2015, Professor George S. Easton

The following links can be used to "provision" an Amazon Web Services Windows instance with Python and R for doing data science.

Firefox Browser

https://download.mozilla.org/?product=firefox-stub&os=win&lang=en-US

Note: On a server, you should generally not browse the internet, as this creates the risk of malware getting onto the server and then infecting all of the machines that the server serves. For this reason, although Internet Explorer is already installed on the Windows Server 2012 instance in AWS, its security level is set to high, and it is impossible (or at least very difficult) to change that setting. We therefore need to install another browser in order to provision our instance easily. Firefox is very easy to install. Keep in mind, however, the caution above about browsing the internet on a server.

ClamWin: Open Source Anti-Virus Scanning Software

http://downloads.sourceforge.net/clamwin/clamwin-0.99.1-setup.exe

After you click this link, WAIT! There are all kinds of "stuff" on the page that comes up that look like they might be the download button or link. They are trying to get you to click through and/or install other programs. So just WAIT; the real download should start by itself after a few seconds.

Note: An AWS Windows instance does not include any anti-virus software by default. There are many free anti-virus packages available, but, surprisingly, almost all of them "detect" that they are being installed on a Windows Server operating system and then refuse to install. Apparently, the Windows Server operating system is considered "professional" software and is thus not eligible for the free versions of the virus scanning software.

ClamWin is an exception. It is both free and open source. It is the only free option that I have found that works on the Windows Server operating system without some kind of hack. Thus, it is the only free option that I have found that one can easily use with Windows instances in AWS.

One thing to note, however, is that ClamWin does not do real-time scanning (see below).

Clam Sentinel: Active Real-Time Scanning Using ClamWin

http://sourceforge.net/projects/clamsentinel/files/latest/download

Once again, WAIT after you click this link.

Clam Sentinel adds real-time virus scanning to the ClamWin anti-virus software.

IMPORTANT! When installing the programs below (especially Anaconda Python), temporarily turn off Clam Sentinel. To do this, go to your system tray, click on the shield-shaped icon corresponding to Clam Sentinel, and click "stop" in the menu that appears.

Anaconda Python

https://repo.continuum.io/archive/Anaconda2-4.1.1-Windows-x86_64.exe

Anaconda Python is free and includes many very useful modules already installed as well as other useful tools and software (such as the Spyder IDE). I use Python version 2.7. This is a stable "production" version of Python. I do not yet use Python 3.X as I have found it to be unstable and many useful modules have not yet been ported to Python 3.

NOTE: You MUST turn off your virus scanning software during the installation of Anaconda Python!!! ClamWin and Clam Sentinel WILL break the Anaconda Python installation!
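
Because an interrupted installation can leave Anaconda Python partly broken, it may be worth checking the result once the installer finishes. The following is only a minimal sketch of such a check, not part of the official installation procedure. The function name check_modules and the list of modules tested are my own assumptions for illustration; substitute whichever modules you actually plan to use. It is written to run under Python 2.7 and also under Python 3.

```python
from __future__ import print_function

import importlib
import sys


def check_modules(names):
    """Return a dict mapping each module name to True if it imports cleanly.

    check_modules is a hypothetical helper written for this sketch; it is
    not something that Anaconda itself provides.
    """
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except Exception:
            # A damaged installation can fail with errors other than
            # ImportError, so any exception is treated as a failure.
            results[name] = False
    return results


# Show which interpreter is actually running.
print("Python version:", sys.version.split()[0])
print("Interpreter:", sys.executable)

# Assumed list of modules to test; change it to suit your own work.
modules_to_check = ["numpy", "scipy", "pandas", "matplotlib"]
for name, ok in sorted(check_modules(modules_to_check).items()):
    print(name, "OK" if ok else "MISSING OR BROKEN")
```

You can paste this into the console in Spyder or save it as a file and run it. If a module is reported as missing or broken, one possible cause is that the virus scanner was running during installation, in which case reinstalling Anaconda Python with Clam Sentinel stopped is a reasonable thing to try.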

The R Statistical Language

https://cran.revolutionanalytics.com/bin/windows/base/R-3.3.1-win.exe

The R statistical language is free and open source, and it is probably the most widely used software for research in statistical methods. It is a very flexible language with many installable packages that allow you to apply almost any statistical method.

RStudio

https://download1.rstudio.org/RStudio-0.99.903.exe

RStudio is a free IDE (integrated development environment) for R.

CYGWIN: Linux-like Tools for Windows

http://cygwin.com/setup-x86_64.exe

Note: The AWS instances that I use when I teach data science to my MBA students are Windows instances. I made the decision to use Windows instances so that my students do not have to learn another operating system in addition to the many other things they will need to learn about data science. However, some Unix/Linux tools are very useful, especially for handling large data sets. CYGWIN provides those tools in a Windows environment.

Note: CYGWIN is pronounced sig-win.
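
The CYGWIN tools are normally run from the CYGWIN terminal. If you also want to call them from elsewhere (for example, from Python), their directory has to be on the Windows PATH. The following is a minimal sketch for checking which tools are visible, under two assumptions of mine: that you have added CYGWIN's bin directory (commonly C:\cygwin64\bin for the 64-bit installer) to the PATH yourself, and that the four tools listed are the ones you care about. The function name find_on_path is hypothetical and was written for this illustration.

```python
from __future__ import print_function

import os


def find_on_path(name, path=None, extensions=("", ".exe")):
    """Return the full path of the first file matching name, or None.

    find_on_path is a hypothetical helper for this sketch. It searches the
    directories in `path` (the PATH environment variable by default) and
    tries the name both as given and with ".exe" appended.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate):
                return candidate
    return None


# Assumed list of Unix/Linux tools to look for; adjust as needed.
for tool in ("grep", "head", "wc", "sort"):
    print(tool, "->", find_on_path(tool) or "not found on PATH")
```

Note that Windows has its own sort.exe, so a hit for "sort" does not by itself show that the CYGWIN version is the one being found; look at the directory in the path that is printed.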

IMPORTANT! When you finish installing the software, turn Clam Sentinel back on. It is critical that you have real-time virus scanning turned on at all times (except when installing software). To turn it back on, go to your system tray, click on the shield-shaped icon belonging to Clam Sentinel, and then click "start" in the menu that appears.

© Copyright 2015, 2016, Professor George S. Easton