 

MachineLearning

Page history last edited by doug chang 1 year, 8 months ago

 

CS229 Machine Learning

 

Meets every Thu starting 4/22 at 7pm at Hackerdojo.

 

This class is based on the Stanford cs229 material developed by Professor Andrew Ng. We have permission to use his materials from the course.

 

We are trying some things differently to emphasize the work-related nature of the student population. We have sponsorship from Amazon for Elastic MapReduce and AWS, so students can implement versions of the algorithms presented in class on a cluster. We should have something to report back to Professor Ng at the end of the class. We have a wide variety of people from industry; the goal is SHDH with some structure, so people can meet other people and do some cool machine learning projects. Free compute time.

 

The course videos are on YouTube, or they can be downloaded from this site. The assignments, handouts, and lecture notes are available from the course website: http://www.stanford.edu/class/cs229/

 

We will meet once a week for ~10 weeks to discuss the lecture material and problem sets.

 

We also have volunteers willing to lead and teach the class: people who have a background in this area and who have taken the class before.

 

Please sign up in advance. We are limiting enrollment because of limited resources (time of volunteer instructors).

 

Volunteer Instructor: Mike Bowles: http://www.linkedin.com/in/mikebowles

 

 

The first meeting on 4/22 will cover administrative details, hw1, and a review of lecture 1 from the cs229 YouTube site.

 

http://www.youtube.com/results?search_query=stanford+cs229&search_type=&aq=1m&oq=cs229

 

Lecture 1: http://www.youtube.com/watch?v=UzxYlbK2c7E (useless, skip it)

Lecture 2: http://www.youtube.com/watch?v=5u4G23_OohI

Lecture 3: http://www.youtube.com/watch?v=HZ4cvaztQEs

Lecture 4: http://www.youtube.com/watch?v=nLKOQfKLUks

Lecture 5: http://www.youtube.com/watch?v=qRJ3GKMOFrE

Lecture 6: http://www.youtube.com/watch?v=qyyJKd-zXRE

 

Downloadable Fall 2009/2010 cs229 RealAudio lectures/PS

 

CS229 lectures

Stanford Online - 9 21 2009.rm, Lecture 1

Stanford Online - 9 23 2009.rm, Lecture 2

Stanford Online - 9 25 2009.rm, PS1 Linear Algebra Review

Stanford Online - 9 28 2009.rm, Lecture 3

Stanford Online - 9 30 2009.rm, Lecture 4

Stanford Online - 10 5 2009.rm, Lecture 5

Stanford Online - 10 7 2009.rm, Lecture 6

Stanford Online - 10 12 2009.rm, Lecture 7

Stanford Online - 10 14 2009.rm, Lecture 8

Stanford Online - 10 19 2009.rm

Stanford Online - 10 21 2009.rm

Stanford Online - 10 26 2009.rm

Stanford Online - 10 28 2009.rm

Stanford Online - 10 31 2009.rm

Stanford Online - 11 2 2009.rm

Stanford Online - 11 4 2009.rm

Stanford Online - 11 9 2009.rm

Stanford Online - 11 11 2009.rm

Stanford Online - 11 16 2009.rm

Stanford Online - 11 18 2009.rm

Stanford Online - 11 30 2009.rm

Stanford Online - 12 2 2009.rm

 

 

 

4/21/2010: 20 people signed up

 

 

HW #1  Notes:

To install Octave under Windows, you don't need to download additional packages. Install Cygwin for Windows and check both the Octave and Gnuplot packages under Math when running Cygwin's setup.exe.

 

If Octave sucks for you as it did me, try R: http://cran.r-project.org/

 

public cs229 course page: http://see.stanford.edu/see/courseinfo.aspx?coll=348ca38a-3a6d-4052-937d-cb017338d7b1

Past hw1: http://see.stanford.edu/materials/aimlcs229/problemset1.pdf

Past Solutions: http://see.stanford.edu/materials/aimlcs229/ps1_solution.pdf

Past hw2: http://see.stanford.edu/materials/aimlcs229/problemset2.pdf

Past Solutions: http://see.stanford.edu/materials/aimlcs229/ps2_solution.pdf

 

 

HW1:

Problem 1b,c Solutions: cs229-public_hw1_1

Problem 1b,c & LWLR implementation in python: cs229-hw1_1b_py  

"Public" 2a solution in matlab: cs229-public_hw1_2

Problem 2a,b Solutions: cs2292abc.pdf

2d solutions (Matlab): cs229_hw1_2

Problem 3a,b,c Solutions: Problem 3abc.pdf

 

Stanford takes the honor code very seriously, so it would be best not to post any solutions or any of your work on a website. If a student is caught not doing their own work, they are expelled. It is best that what we are doing here at HD does not become a way around doing your own work.

 

 

Generative and Discriminative Learning Notes

http://www.cs.cmu.edu/~tom/mlbook/NBayesLogReg.pdf

 

 

Amazon AWS/EMR Resources

Read anything written by Jinesh Varia from Amazon; his documentation is extremely well written. He will be here to talk to the class on 6/17.

http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1633

Hadoop MR by Jinesh Varia:

http://developer.yahoo.net/blogs/theater/archives/2009/07/amazon_elastic_mapreduce.html

 

You have a choice: you can either use Amazon EMR (Elastic MapReduce), or you can use Hadoop on AWS; see Cloudera.

EC2 Resources:

http://www.cs.washington.edu/education/courses/490h/08au/ec2.htm

 

 

 

Map Reduce Assignments

 

Below is a list of 4 assignments for map reduce. You can use either Amazon EMR or Hadoop MR for the assignments.

 

http://code.google.com/edu/submissions/uwspr2007_clustercourse/listing.html 

http://code.google.com/edu/submissions/uwashington-scalable-systems/

 

The UW 490H class materials from 2008 are very good.

Assignment 1: Inverted Index: assignment1.pdf

Assignment 2: Run Page Rank on Wikipedia: assignment2.pdf

Assignment 3: Create a tiled series of rendered map images from public TIGER data: assignment3.pdf geosource.zip

Assignment 4: Push data from Assignment 3 onto Amazon EC2 and create servers to publish data: assignment4.pdf ec2source.zip
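
As a rough illustration of the kind of job Assignment 1 asks for, here is a minimal single-machine sketch of an inverted index written in the map/reduce style, in Python. The function names (map_phase, reduce_phase, inverted_index) and the two sample documents are made up for illustration and are not taken from the UW assignment. On EMR or Hadoop the grouping step would be done by the framework's shuffle, not by a dictionary as it is here.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Mapper: emit one (word, doc_id) pair per word in the document."""
    for word in text.lower().split():
        yield word, doc_id


def reduce_phase(word, doc_ids):
    """Reducer: turn all doc ids seen for a word into a sorted posting list."""
    return word, sorted(set(doc_ids))


def inverted_index(docs):
    """Run map, group by key, and reduce over a dict of {doc_id: text}."""
    # Stand-in for the shuffle step: group mapper output by word.
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for word, emitted_id in map_phase(doc_id, text):
            grouped[word].append(emitted_id)
    return dict(reduce_phase(word, ids) for word, ids in grouped.items())


# Hypothetical sample input, only to show the shape of the result.
docs = {"d1": "map reduce on a cluster", "d2": "reduce the data"}
index = inverted_index(docs)
```

Here index maps each word to the list of documents that contain it, so looking up index["reduce"] gives the ids of both sample documents. Keeping the mapper and reducer as separate functions is deliberate: they are the two pieces you would port to a real Hadoop or EMR job.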

 

 

 

 

Doug Chang

doug.chang@hackerdojo.com

 

 

 

Comments (11)

fenn said

at 4:52 pm on Mar 15, 2010

is there a particular language i need to know for this class? (i.e. matlab?)

doug chang said

at 5:10 pm on Mar 15, 2010

no, you will have to be a member of HackerDojo though.

John Nagle said

at 9:54 pm on Apr 20, 2010

Some early notes on the materials:

- The notation drives me nuts. Superscripts are sometimes used as exponents, sometimes used as indices (see problem 1a), and even used as footnote numbers in some places. Subscripts can be indices, or they can be parameters to functions or operators. The style follows Lewis Carroll: `When I use a symbol,' Humpty Dumpty said, in rather a scornful tone, `it means just what I choose it to mean -- neither more nor less.' .

- The type in formulas is so tiny, and in such lightweight TeX fonts, that it won't print readably at 300 DPI. (On laser printers, try 600DPI and a fresh print cartridge. Or zoom in online.)

- In the notes, expressions are generally under-parenthesized, and the precedence of operators is non-obvious. The trace operator, "tr", seems to have precedence above addition but below multiplication. So tr AB = tr (AB), but tr A + B = (tr A) + B. That's reasonable enough, but not a universal convention. a(x) can be either a*x or a of x, depending on context and definitions. There's widespread use of single-letter function names. I'm finding it necessary to rewrite some of the expressions with more parentheses just so I can figure out what's being said.

John Nagle said

at 7:21 pm on Apr 21, 2010

Are the data files mentioned in the problem sets available? "/afs/ir/class/cs229/ps/ps1/q1x.dat", etc.? Stanford's Andrew file system doesn't allow public access.

doug chang said

at 10:02 pm on Apr 21, 2010

Data is available on ps1 Data blue hyperlink labeled data right next to PS1 download.

John Nagle said

at 9:27 pm on Apr 25, 2010

Problem 1b, the minimum is somewhere near [-3.05305 0.89175 1.62786]. Correct?

John Nagle said

at 9:23 pm on Apr 28, 2010

"Stanford Online - 10 14 2009.rm" is truncated. It's only the first half hour or so.

peter.harrington said

at 11:23 am on May 13, 2010

The videos are also available on iTunes, I had to resort to this after Youtube screwed up lecture 6.

john.leung said

at 9:18 am on May 14, 2010

I watch the real media ones, they are a yr more recent I believe and slightly better imo.

lance.norskog said

at 8:37 pm on May 17, 2010

Was there talk of a mailing list? I'm goksron@yahoo.com.

Lance Norskog

peter.harrington said

at 7:02 am on May 19, 2010

I believe this page is retired and we are now using: http://machinelearning123.pbworks.com/FrontPage
