Bias vs. Variance Tradeoff, Cross-Validation, and Overfitting in Prediction (Part 2)

This is the second part of the two-part video series that discusses the bias vs. variance tradeoff, overfitting, and basic cross-validation.

The R source code used in this video can be found here.

Note: You will want to play this video at 1080p HD and full screen in order to be able to see what is going on.

Bias vs. Variance Tradeoff, Cross-Validation, and Overfitting in Prediction (Part 1)

This video discusses the bias vs. variance tradeoff, overfitting, and basic cross-validation. There are two parts. Part 1 (below) discusses the ideas. Part 2 shows an example using regression trees.
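As a rough illustration of these ideas, here is a minimal Python sketch of k-fold cross-validation. It is not the R code from the video: it swaps the regression trees for a simple k-nearest-neighbor regressor so that it runs with only the standard library, and the synthetic data, function names, and parameter values are all assumptions made for the example.

```python
import random


def make_data(n, seed=0):
    """Synthetic data (an assumption): a straight line plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x, k):
    """Average the y values of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / float(k)


def mse(train_x, train_y, test_x, test_y, k):
    """Mean squared error of a k-NN model fit on train, scored on test."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2
              for x, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)


def k_fold_indices(n, folds):
    """Split the indices 0..n-1 into interleaved, non-overlapping folds."""
    return [list(range(f, n, folds)) for f in range(folds)]


def cross_val_mse(xs, ys, k, folds=5):
    """Hold out each fold in turn, fit on the rest, and average the errors."""
    scores = []
    for held_out in k_fold_indices(len(xs), folds):
        held = set(held_out)
        train_x = [xs[i] for i in range(len(xs)) if i not in held]
        train_y = [ys[i] for i in range(len(ys)) if i not in held]
        test_x = [xs[i] for i in held_out]
        test_y = [ys[i] for i in held_out]
        scores.append(mse(train_x, train_y, test_x, test_y, k))
    return sum(scores) / len(scores)


xs, ys = make_data(60)
for k in (1, 5, 20):
    train_error = mse(xs, ys, xs, ys, k)  # scored on the data it was fit to
    cv_error = cross_val_mse(xs, ys, k)   # scored on held-out data
    print(k, round(train_error, 3), round(cv_error, 3))
```

With k = 1 the model simply memorizes the training data, so its training error is zero, while the cross-validated error is not. That gap is the overfitting that cross-validation is meant to expose. Larger values of k smooth the predictions, trading variance for bias.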

A pdf file of the slides used in this video can be obtained here.

Python With Spyder 12: Dictionaries

This is the 12th in a series of videos providing a tutorial on Python 2.7 using Anaconda Python and the Spyder IDE. Click here to go to a “home page” for the video series.

This video introduces dictionaries in Python. Dictionaries are Python’s implementation of the idea of key-value pairs.
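The basic operations covered in the video can be sketched as follows. This is not the source file from the video: the dictionary names and contents are hypothetical, chosen only to illustrate the syntax, and the code is written so that it should run under both Python 2.7 and Python 3.

```python
# Hypothetical example data; the names and values are not from the video.
# The key "bob" appears twice, so the later value overwrites the earlier one.
ages = {"alice": 31, "bob": 27, "bob": 45}

print(ages["bob"])   # reference the value corresponding to a key

ages["alice"] = 32   # assign a new value to an existing key
ages["carol"] = 19   # assigning to a new key creates a new key-value pair

# Keys and values do not have to be strings.
mixed = {1: "one", (2, 3): [2, 3], "pi": 3.14159}

# A second way to build a dictionary: start empty and add pairs.
empty_literal = {}      # empty dictionary using curly brackets
empty_ctor = dict()     # empty dictionary using the class constructor
empty_ctor["x"] = 10

# Test for a key using the "in" keyword.
has_carol = "carol" in ages
has_dave = "dave" in ages
print(has_carol, has_dave)
```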

The source code used in this video can be found here. You can right click on the link and use “Save As” to save the file.

Note: The source code files are plain text files with a “.txt” extension. You will probably want to change the extensions to “.py” after you download them. If you do so, please be aware that if you have Python installed, the file will become executable, so that it will run if you click on it (accidentally or otherwise).

The video is about 31 minutes long.


Next Video: For Loops

Video Index: Dictionaries

Click on the topics below to jump to that location in the video.

Time Topic
00:00Title slide
00:06Introduction
00:44Creating a dictionary using curly brackets {}.
01:01Key-value pairs separated by the colon (:).
02:06Keys and values do not have to be strings.
02:20Keys and values can be any Python object.
02:27Keys must be unique.
02:40Using a duplicate key overwrites earlier value.
03:05Run the Python code to create the dictionary.
02:12Cells of code in the editor (review)
03:38Run the code in a cell using Ctrl-Enter.
03:47Make the Python console larger.
04:06Print out the dictionary in the Python console.
04:37Reference the value corresponding to a key.
05:04Execute a line of code using F9.
05:22Assign a new value corresponding to a key in the dictionary.
06:36Assigning a value to a new key not already in the dictionary creates a new key-value pair.
07:31Dictionaries in Python do not have an order.
07:51Cannot reference dictionary items using an index.
08:31Dictionaries cannot be sorted.
09:04Second way to create a dictionary starting with an empty dictionary and adding key-value pairs.
09:40Create an empty dictionary using curly brackets {}.
09:53Load key-value pairs into the dictionary by assigning values to new keys.
10:30Virtually everything in Python is an object.
10:43Dictionaries are objects.
10:52The class constructor for dictionaries in Python is dict()
10:57Create an empty dictionary using dict().
11:17As objects, dictionaries have methods.
11:31Two important dictionary methods are keys() and values()
11:47Example using the keys() method.
12:15The type returned by the keys() method is a list.
12:24Example using the values() method.
12:44The keys and values returned by the keys() and values() methods are in the same order.
13:36The pop() method.
13:47pop() method example.
14:38The popitem() method.
14:59popitem() method example.
16:06The type of the key-value pair returned by the popitem() method is a tuple.
16:33Test for a key in the dictionary using the “in” keyword.
16:39Clear the console window using Ctrl-L.
16:57Example of using “in” for a key that is in the dictionary.
17:37Example of using “in” for a key that is not in the dictionary.
18:16Using the dir() built-in function to list out all the attributes and methods of a dictionary.
18:35Attributes and methods beginning with a double underscore (__) are intended to be private.
19:05Getting information about dictionary methods using internet search.
19:24Python.org is the “official” web site for Python.
19:38Finding dictionary (mapping types) documentation in the “Built-in Types” documentation.
20:09Cannot create copies of dictionaries by assigning the dictionary name.
20:17Dictionary name assignment example.
21:24Example of creating a copy of a dictionary using dict().
22:39Use caution when using assignment with dictionary names.
23:22More complicated dictionary example using objects as values.
24:11Example builds on code from the last video: Python With Spyder 11: Inheritance.
24:16Review objects defined in PythonWithSpyder11.py
24:56Execute object definition code from PythonWithSpyder11.py
26:00Create a dictionary with objects as values.
27:49Print the dictionary (from the example with objects as values).
28:28Accessing an attribute of an object in the dictionary.
29:00AttributeError when accessing an attribute not defined for an object in the dictionary.
29:26Second example of accessing attributes of objects in the dictionary.
30:04Summary of the dictionary example using objects.
30:44Conclusion
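The dictionary methods and the copy-versus-assignment pitfall listed in the index above can be sketched as follows. This is an illustration rather than the code used in the video: the Person and Student classes are hypothetical stand-ins for the objects defined in PythonWithSpyder11.py, and the data is made up. The video uses Python 2.7, where keys() and values() return lists. In Python 3 they return view objects, so the sketch wraps them in list() to behave the same way in both versions.

```python
inventory = {"apples": 3, "pears": 5, "plums": 7}

# keys() and values() are returned in matching order.
keys = list(inventory.keys())
values = list(inventory.values())
pairs_match = all(inventory[k] == v for k, v in zip(keys, values))

# pop(key) removes the key and returns its value.
removed_value = inventory.pop("pears")

# popitem() removes one key-value pair and returns it as a tuple.
removed_pair = inventory.popitem()

# Assignment does not copy: both names refer to the same dictionary.
original = {"a": 1}
alias = original
alias["b"] = 2            # the change is also visible through `original`

# dict() builds a new (shallow) copy that can change independently.
copy = dict(original)
copy["c"] = 3             # `original` is unaffected


# Hypothetical classes, standing in for those from the previous video.
class Person(object):
    def __init__(self, name):
        self.name = name


class Student(Person):
    def __init__(self, name, school):
        Person.__init__(self, name)
        self.school = school


# A dictionary with objects as values.
people = {"p1": Person("Ann"), "s1": Student("Ben", "State U")}
school = people["s1"].school

# Accessing an attribute the object does not define raises AttributeError.
try:
    people["p1"].school
    missing_attribute = False
except AttributeError:
    missing_attribute = True
```

Because dict() makes a shallow copy, the objects stored as values are still shared between the original and the copy. Only the mapping from keys to values is duplicated.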

Next Video: For Loops

AWS Windows Instance Set Up: Wrapping Up

This video is the last part in a series of videos that show how to set up a Windows virtual machine (instance) using Amazon Web Services and then provision it with Python and R (and some additional software) so that it can serve as a platform for doing some data science. In this video I check that the software we installed in the last video works. I also show how to create an AWS AMI (Amazon Machine Image), which allows new instances to be launched already fully provisioned and also serves as a backup of the work done provisioning the instance.

See the previous video in this series for instructions on installing the software.

The video is about 15 minutes long.


Changes/Corrections

At time 1:28, the video discusses moving from the Start Screen to the All Apps screen. As discussed in the Step 3 video, the method shown in this video is obsolete and replaced by a small white arrow in a circle at the bottom left of the Start Screen.

Video Index

Time Topic
00:00Title Slide
00:03Introduction
00:43Toggle between the Start Screen and Desktop using the Windows Key
01:09Tiles on the Start Screen for the New Software
01:28Show the “All Apps” Screen
02:00Check that the ClamWin programs are installed
02:18Check that the ClamSentinel programs are installed
02:29Check that the Anaconda Python programs are installed
03:00Check that the R programs are installed
03:29Check that the Rstudio program is installed
03:36Check that the CYGWIN program is installed
04:20Return to the Start Screen
04:27Launch the Anaconda Python Command Prompt
05:05Launch the Spyder IDE
05:45Launch Rstudio
06:24Launch CYGWIN
07:31What to do if there is a problem
07:50Uninstall programs — Open the Control Panel
08:04Open Uninstall Programs
08:26After uninstall, may need to remove program folder
08:50Launch file explorer and navigate to the C: drive
08:56The location of the Anaconda Python and CYGWIN program folders
09:05Should remove Anaconda Python and CYGWIN folders when uninstalling and then reinstalling these programs
09:18Location of the R and Rstudio program folders
09:25May need to remove the R and Rstudio folders when reinstalling these programs
09:36Turn off virus software when installing or reinstalling the software
09:43Turning off ClamSentinel
09:59Logging out of the Windows Instance
10:25EC2 Dashboard and Instances Dashboard
11:01Identifying your instance
11:27Good idea to create an AMI (Amazon Machine Image)
11:52Select the instance
12:20Create the AMI (Amazon Machine Image)
12:30Name and describe the image
13:16Navigate to the AMI (under Images)
13:44Image serves as a backup
14:10Conclusion

AWS Windows Instance Set Up Step 4: Installing Open Source Software for Data Science

This video is part 4 in a series of 5 videos that show how to set up a Windows virtual machine (instance) using Amazon Web Services and then provision it with Python and R (and some additional software) so that it can serve as a platform for doing some data science. In this video I show how to obtain open source software and install it on the Windows instance. The primary software I am installing is Python 2.7 and the statistics software R with the intention of using this platform to do some data science.

The software installed in this video is Firefox, ClamWin antivirus software, Clam Sentinel real-time virus scanning software based on ClamWin, Anaconda Python, the R statistics package, and the cygwin unix-like tools for Windows.

NOTE: It is very important to turn off the virus scanning software Clam Sentinel while installing the rest of the software (especially Anaconda and cygwin).

A web page with the links for provisioning the Windows instance can be found at:

DataScienceSource.com/WindowsInstanceProvisioning

The video is about 30 minutes long.


Changes/Corrections

  • The web page at DataScienceSource.com/WindowsInstanceProvisioning has been updated since the video was originally made. The links to the software are not exactly the same as in the video since they have all been updated to point to the latest versions. Also, a few additional warnings have been added. None of this really affects the usefulness of the video.
  • At time 27:14 in the video, I begin to discuss how to test if the “chere” feature of the cygwin software has been successfully installed. Things have changed slightly. Open the File Explorer as indicated in the video. Next double click on the Documents folder to change into that folder. Then, right-click in the blank space (under the “This folder is empty.” message) and you should then see “Bash Prompt Here” in the context menu. “Bash Prompt Here” no longer shows up at the top level (the “This PC” level) in the File Explorer.

Video Index

Click on the topics below to jump to that location in the video.

Time Topic
00:00Title Slide
00:03Introduction
00:49Log into the Windows Instance at AWS
02:53The Windows Instance Desktop
03:00Using the Windows button to switch to the Start Screen
03:06Launch Internet Explorer
03:13The Internet Explorer Warning about Browsing on Servers
03:39Internet Explorer is set to a very high level of security
04:11The links page at DataScienceSource.com for provisioning the instance
05:05Installing the Firefox browser
05:19Add the Firefox web site to the list of trusted sites for IE
07:04Close Internet Explorer
07:30Navigate again to the provisioning links page at DataScienceSource.com
08:24The Windows Server 2012 Instance Lacks Antivirus Software
08:43It is hard to find free or open source antivirus software for Windows Server
09:15Free Open-Source Antivirus Software: ClamWin and ClamSentinel
09:40Install ClamWin
10:04Show the Recent Downloads list for the Firefox browser using the arrow icon
11:34Install ClamSentinel
13:42ClamSentinel is in the System Tray
13:48The ClamSentinel Menu — Stopping and Starting the ClamSentinel virus scanning
14:29Discussion of Python and Anaconda Python
15:30Discussion of Python 2.7 and 3.4
16:42Must turn off the Antivirus software to install Anaconda Python
17:06Turning off the ClamSentinel antivirus software
17:29Installing Anaconda Python
17:45The Firefox download progress icon and recent downloads
19:15Installing R
20:52Installing Rstudio
21:58Discussion of CYGWIN
22:26Installing CYGWIN
24:25Installing the chere package (or feature) of CYGWIN
26:16Complete the installation of chere
27:14Check that chere is installed
28:20Restart the antivirus software ClamSentinel
28:51Conclusion

Next Video: Wrapping Up

AWS Windows Instance Set Up Step 3: Connecting to an AWS Windows Instance

This video is part 3 in a series of 5 videos that show how to set up a Windows virtual machine (instance) using Amazon Web Services and then provision it with Python and R (and some additional software) so that it can serve as a platform for doing some data science. In this video I show how to connect to a Windows instance at AWS from a Windows computer using Microsoft Remote Desktop Connection.

Note: Microsoft Remote Desktop Connection clients exist for most operating systems including for Mac, iPad, and Linux. If you are not using a Windows computer to connect to the instance, after watching this video, you should be able to set up a client for your own computer without much trouble. Important for Mac OS X users: Use the Remote Desktop App that you can obtain from the Apple App Store.

The video is about 17 minutes long.


Changes/Corrections

Starting at time 14:06 in the video, I explain how to move from the Start Screen to the screen that lists all of the Apps. This has changed in more recent versions of Windows 8. Now, at the bottom of the Start Screen on the left-hand side is a small white arrow in a white circle that takes you to the page of all the Apps. This is much more intuitive and easier than the right-clicking approach discussed in the video. Click here to see an image of the updated Start Page.

Video Index

Click on the topics below to jump to that location in the video.

Time Topic
00:00Title Slide
00:03Introduction
00:23EC2 Table of Instances
01:02Navigate to the EC2 Instance Table
02:08Review Information Saved in the Instance Folder
04:28Select Instance in the EC2 Instance Table
04:38The “Connect” Button
04:41The “Connect to Your Instance” Dialog
04:46The “Download Remote Desktop File” Button
05:19The Remote Desktop File default name is the IP number
06:04The “Get Password” Button
06:25Input the Key Pair Information from the .pem file
07:03The Private Key
07:22Decrypting the password
07:34Save the password in the Instance Data Excel file
08:03Close the “Connect to Your Instance” dialog
08:18Run the .rdp file for the instance
08:39Ignore the warning
08:48Login name and password
09:24Ignore the certificate warning and click “Yes”
09:37Instance Windows 8 Start Screen
10:24Return to the local computer
10:35Reduce the Remote Desktop Protocol from fullscreen to a Window
10:53Minimize the Remote Desktop Connection to the local computer taskbar
11:05Return to the AWS Windows Instance
11:29Move the Remote Desktop Protocol top bar around
11:38Windows 8 Start Screen
11:46In Windows 8, the Start Screen replaces the Start Menu
11:47Windows 8 Programs List
12:11Brief Introduction to Windows 8
12:21Moving between the Start Screen and the Desktop
12:07Programs are now called “Apps”
12:28Moving to the Desktop using the Desktop Tile on the Start Screen
12:36Using the Windows Key
13:02Move to the Windows 8 Start Screen using the Windows Key
13:15Move back to the Desktop using the Windows Key
13:21Move back to the Windows 8 Start Screen
13:25Expose the Charms (including the Settings)
13:41Show the Settings Menu including the Power Button
14:00Hide the Settings Menu
14:06Showing the “All Apps” Page
15:13Return to the Start Screen from the All Apps page using the Windows Key
15:19Determine the disk size and available disk space.
15:39Switch to the Desktop
15:41Open File Explorer
15:46Switch to the Computer folder
15:53Disk C size and space available
16:02Conclusion

Next Video: Step 4 — Provisioning the Instance

AWS Windows Instance Set Up Step 2: Launching a Windows Instance

This video is part 2 in a series of 5 videos that show how to set up a Windows virtual machine (instance) using Amazon Web Services (AWS) and then provision it with Python and R (and some additional software) so that it can serve as a platform for doing some data science. In this video I show how to launch a Windows instance using AWS.

The video is about 20 minutes long.


Changes/Corrections

  • The section of the video from time 3:57 to 4:55 is obsolete and can be ignored. This section of the video begins with a pop-up dialog that is titled “Boot from General Purpose SSD.” General purpose SSD is now the default and your virtual instance will automatically have a 30GB SSD drive. You will not be prompted as shown in the video.
  • The Actions menu which appears in the video at time 13:16 has changed. The large number of items in the old menu have been consolidated into groups in the new menu. Specifically, the Actions category of the old menu has now been changed to the Instance State menu item which leads to a sub-menu with the items Start, Stop, Reboot, and Terminate.

Video Index

Click on the topics below to jump to that location in the video.

Time Topic
00:00Title Slide
00:04Introduction
00:33Navigating to AWS
00:50Logging in to AWS
01:21Page of Links to All AWS Services
01:31Navigating to the page of AWS Services if already logged in
01:44EC2 link and dashboard
02:05Launch instance button
02:13List of Image Types (Amazon Machine Images)
02:30Scroll to MS Windows Server 2012 R2 Base image (Free Tier Eligible)
03:24Choose instance types
04:00Select device types (SSD 30GB)
04:57Security Issues
05:30IAM (Identity and Access Management) – Accept Defaults
06:08A Secure Connection requires a Key Pair
06:40Create a new key pair
07:52Create a folder on the local computer for the key pair
09:53Download the key pair
10:45Launch the instance
10:52Launch status – initializing
11:19Instance console
12:42Give the instance a name
13:11Action Menu (Terminate, Reboot, Stop, and Start)
13:36Select an Instance for the Action
14:53Instance state is “running”
15:08Time delay in launching an instance
15:22Save the data about an instance
18:40Instance IP and DNS Name can change on stop and start
19:13Conclusion

Next Video: Step 3 — Connecting to the Instance

AWS Windows Instance Set Up Step 1: Setting Up An Amazon Web Services Account

This video is step 1 in a series of 5 videos that show how to set up a Windows virtual machine (instance) using Amazon Web Services and then provision it with Python and R (and some additional software) so that it can serve as a platform for doing some data science. In this video I discuss setting up an Amazon Web Services account, using the “Free Tier” services, and setting up a billing alert.

The video is about 11 minutes long.


Changes/Corrections

  • The choices that show up in the billing alert notification menu have changed (about time 9:30 in the video). Instead of seeing and selecting the account owner’s name (in my case George), you should select Notify Me.

Video Index

Click on the topics below to jump to that location in the video.

Time Topic
00:00Title slide
00:04Introduction
00:36Navigate to AWS (aws.amazon.com)
01:09Link for creating free account
01:36Explore information on free (“free tier”) services
02:59Information on EC2 (Elastic Compute Cloud)
03:42Begin Sign-Up
03:56Need Two Things for Account – Credit Card and Phone Access
04:54Log in to your AWS Account
05:50AWS Services Page
06:17AWS Regions
07:10Need Billing Alert – Billing and Cost Management
08:08Set up Billing Alert
10:05Conclusion

Next Video: Step 2 — Launching a Windows Instance

AWS Windows Instance Set Up Introduction

This video provides a brief introduction to a series of 5 videos that follow which show how to set up a Windows virtual machine (instance) using Amazon Web Services and then provision it with Python and R (and some additional software) so that it can serve as a platform for doing some data science.

The video is a little less than 2 minutes long.


Next Video: Step 1 — Setting Up an AWS Account