Focus on the details
Work hard to improve

SML-week1

Installing the Required Software Packages

  • Oracle VirtualBox
  • Vagrant, for automatic VM configuration

Note: If you already have either software package installed, make sure that the versions are VirtualBox 4.3.28 (or later) and Vagrant 1.7.2 (or later).

Vagrant: managing the VM

Vagrant is a tool for managing a development environment that runs in virtual machines. In CS190, it is used to build the environment for learning Spark.

The Vagrantfile is a configuration file written in Ruby that tells Vagrant how to build the VM. The first time you run the command below, Vagrant downloads the sparkvm image from the internet to your disk and builds the development environment from it.

  1. vagrant up

Using PySpark in the IPython notebook

Once the Spark development environment is built, open a browser such as Safari and go to http://localhost:8001

Click the ‘upload’ button and upload ML_lab1_review_student.ipynb; you can then answer the problems in this notebook. The lab1 notebook reviews the background that CS190 needs: linear algebra and the basic math behind it, NumPy, and Python lambda expressions.
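As a quick reminder of the kind of material the review covers, here is a minimal sketch using plain NumPy and a lambda. The variable names and values are my own illustration, not taken from the lab notebook.

```python
import numpy as np

# Two small vectors (values chosen only for illustration).
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

scaled = 2 * u          # scalar multiplication
elementwise = u * v     # element-wise product
dot = u.dot(v)          # dot product: 1*4 + 2*5 + 3*6

# A small invertible matrix, its transpose and its inverse.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
A_T = A.T
A_inv = np.linalg.inv(A)
identity = A.dot(A_inv)  # approximately the 2x2 identity matrix

# A lambda is an anonymous one-expression function.
add_one = lambda x: x + 1
shifted = list(map(add_one, [1, 2, 3]))

print(dot)
print(shifted)
```

Note that `*` on NumPy arrays is element-wise; the dot product and the matrix product go through `dot`.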

I already knew the basics of NumPy and Python lambda expressions, so the key new piece for me is DenseVector. It is a class in pyspark.mllib.linalg that stores arrays of values for use in PySpark. Note that DenseVector stores all values as np.float64.

That finishes the content of week 1. Next, we will study programming in Spark using PySpark. Good luck!
