winutils Spark Windows installation

I am trying to install Spark 1.6.1 on Windows 10. So far I have done the following:

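  1. Downloaded Spark 1.6.1, unpacked it to a directory, and set SPARK_HOME.
  2. Downloaded Scala 2.11.8, unpacked it to a directory, and set SCALA_HOME.
  3. Set the _JAVA_OPTION environment variable.
  4. Downloaded winutils as a zip archive from https://github.com/steveloughran/winutils.git and set the HADOOP_HOME environment variable. (I am not sure this is correct; I could not clone the repository because permission was denied.)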

When I go to the Spark home directory and run bin\spark-shell, I get:

'C:\Program' is not recognized as an internal or external command, operable program or batch file. 

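I must be missing something. I don't know how to run a bash script from a Windows environment, but I hope I don't need to understand that just to get this working. I have been following this person's tutorial: https://hernandezpaul.wordpress.com/2016/01/24/apache-spark-installation-on-windows-10/ . Any help would be appreciated.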

You need to download the winutils executable, not the source code.

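You can download it here, or, if you really want the whole Hadoop distribution, you can find the 2.6.0 binaries here. Then set HADOOP_HOME to the directory that holds winutils.exe. Hadoop looks for the file at %HADOOP_HOME%\bin\winutils.exe, so HADOOP_HOME should be the parent of the bin folder that contains it.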
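A quick way to confirm the layout is a small check like the sketch below. It is only an illustration: the function names are made up for this example, and it assumes the usual Hadoop convention that winutils.exe sits in a bin folder under HADOOP_HOME. It also reports the case where the file was dropped directly into HADOOP_HOME, since that is an easy mistake to make.

```python
import ntpath
import os


def winutils_candidates(hadoop_home):
    """List the places winutils.exe might be for a given HADOOP_HOME value.

    The first entry is the conventional location (bin subfolder); the second
    covers the case where the file was copied straight into HADOOP_HOME.
    """
    return [
        ntpath.join(hadoop_home, "bin", "winutils.exe"),
        ntpath.join(hadoop_home, "winutils.exe"),
    ]


def find_winutils(hadoop_home, exists=os.path.isfile):
    """Return the first candidate path that exists, or None if neither does.

    `exists` is injectable so the logic can be tried without a real file system.
    """
    for candidate in winutils_candidates(hadoop_home):
        if exists(candidate):
            return candidate
    return None
```

Calling find_winutils(os.environ["HADOOP_HOME"]) on the Windows machine should return a path ending in bin\winutils.exe. If it returns the second candidate instead, moving the file into a bin subfolder is probably the fix.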

Also, make sure the directory where you put Spark has no spaces anywhere in its path. This is very important; otherwise it will not work.
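The error in the question ('C:\Program' is not recognized ...) is consistent with an unquoted path such as C:\Program Files\... being cut at its first space. The sketch below shows that cut and scans a few environment variables for spaces. SPARK_HOME, SCALA_HOME and HADOOP_HOME come from the question; JAVA_HOME is an assumption added here, and the function names are hypothetical.

```python
import os

# SPARK_HOME, SCALA_HOME and HADOOP_HOME are named in the question.
# JAVA_HOME is an assumption: Java is often installed under "Program Files".
PATH_VARS = ("SPARK_HOME", "SCALA_HOME", "HADOOP_HOME", "JAVA_HOME")


def first_token(command):
    """Return the text before the first space.

    An unquoted command line is split on spaces, so this is the part that
    ends up being treated as the program name.
    """
    return command.split(" ", 1)[0]


def vars_with_spaces(env, names=PATH_VARS):
    """Return {name: value} for each listed variable whose value contains a space."""
    return {name: env[name] for name in names if " " in env.get(name, "")}


if __name__ == "__main__":
    # Report every checked variable that points at a path with a space in it.
    for name, value in vars_with_spaces(os.environ).items():
        print(f"{name} contains a space: {value}")
```

Any variable this reports is a candidate for the failure. Moving the corresponding install to a path such as C:\Spark avoids the problem.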

Once that is set up, you do not launch the bash script (bin/spark-shell); you launch spark-shell.cmd instead:

    C:\Spark\bin>spark-shell
    log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
    Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
    To adjust logging level use sc.setLogLevel("INFO")
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
          /_/
    Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit server VM, Java 1.8.0_91)
    Type in expressions to have them evaluated.
    Type :help for more information.
    Spark context available as sc.
    16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-core-3.2.10.jar."
    16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-api-jdo-3.2.6.jar."
    16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-rdbms-3.2.9.jar."
    16/05/18 19:31:56 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
    16/05/18 19:31:56 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
    16/05/18 19:32:01 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
    16/05/18 19:32:01 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
    16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-core-3.2.10.jar."
    16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-api-jdo-3.2.6.jar."
    16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-rdbms-3.2.9.jar."
    16/05/18 19:32:07 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
    16/05/18 19:32:08 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
    16/05/18 19:32:12 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
    16/05/18 19:32:12 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
    SQL context available as sqlContext.
    scala>
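If the shell is started from a script rather than typed by hand, the choice of launcher can be made explicit. This is a minimal sketch, assuming the standard Spark layout where the launchers live in the bin folder; spark_shell_path is a hypothetical helper name, not part of Spark.

```python
import ntpath
import posixpath


def spark_shell_path(spark_home, windows):
    """Build the path of the Spark shell launcher for the given platform.

    On Windows this is bin\\spark-shell.cmd; elsewhere it is the bash
    script bin/spark-shell.
    """
    if windows:
        return ntpath.join(spark_home, "bin", "spark-shell.cmd")
    return posixpath.join(spark_home, "bin", "spark-shell")
```

With SPARK_HOME set to C:\Spark, spark_shell_path(r"C:\Spark", windows=True) gives the same launcher that the session above runs from C:\Spark\bin.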