Hdoop学习笔记（HDP）-Part.17 安装Spark2

本文介绍: 添加Spark2服务需要重启HDFS、YARN、MapRe duce2、Hi v e、HBa s e等相关服务。

目录
Part.01 关于HDP
Part.02 核心组件原理
 Part.03 资源规划
 Part.04 基础环境配置
 Part.05 Yum源配置
 Part.06 安装OracleJDK
Part.07 安装MySQL
Part.08 部署Ambari集群
 Part.09 安装OpenLDAP
Part.10 创建集群
 Part.11 安装Kerberos
Part.12 安装HDFS
Part.13 安装Ranger
Part.14 安装YARN+MR
Part.15 安装HIVE
Part.16 安装HBase
Part.17 安装Spark2
Part.18 安装Flink
Part.19 安装Kafka
Part.20 安装Flume

添加Spark2服务
在这里插入图片描述

需要重启HDFS、YARN、MapReduce2、Hive、HBase等相关服务

在CONFIGS-&g t;Ad vance d spark2-env下的content里，将下面内容加#注释掉

export SPARK_HISTORY_OPTS='-Dspark.ui.filters=org.apache.hadoop.security.authentication.server.AuthenticationFilter -Dspark.org.apache.hadoop.security.authentication.server.AuthenticationFilter.params="type=kerberos,kerberos.principal={{spnego_principal}},kerberos.keytab={{spnego_keytab}}"'

在这里插入图片描述
访问页面，http://hdp01.hdp.com:18081/

查看/usr/hdp/3.1.5.0-152/s park2/conf/sp ar k–env.sh

export HADOOP_HOME=${HADOOP_HOME:-/usr/hdp/3.1.5.0-152/hadoop}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/usr/hdp/3.1.5.0-152/hadoop/conf}

/usr/hdp/3.1.5.0-152/hadoop–yarn/conf/yarn–site.xml

spark-shell --master local

val textFile = sc.textFile("file:///root/wordcount_input")

textFile.first()

val textFile = sc.textFile("file:///root/wordcount_input")
textFile.saveAsTextFile("file:///root/output")

ll /root/output/
cat /root/output/part-00000

val textFile = sc.textFile("hdfs://hdp315/testhdfs/tenant1/wordcount_input")
textFile.first()

spark-shell --master local
val textFile = sc.textFile("file:///root/wordcount_input")
val wordCount = textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).ruduceByKey((a,b) => a + b)
wordCount.collect()

tar -zxvf /opt/sbt-1.8.2.tgz -C /usr/local/

chmod u+x /usr/local/sbt/sbt

/usr/local/sbt/sbt sbtVersion

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object WordCount {
    def main(args: Array[String]) {
        val inputFile =  "hdfs://hdp315/testhdfs/ranger_yarn/wordcount_input"
        val conf = new SparkConf().setAppName("WordCount")
        val sc = new SparkContext(conf)
        val textFile = sc.textFile(inputFile)
        val wordCount = textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a, b) => a + b)
        wordCount.foreach(println)
    }
}

name := "WordCount Project"

version := "1.0"

scalaVersion := "2.11.12"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"

/usr/local/sbt/sbt package

kinit -kt /root/keytab/ranger_yarn.keytab ranger_yarn
spark-submit --class "WordCount" /root/wordcount-project_2.11-1.0.jar --deploy-mode cluster --master yarn

nc -l 1234

name := "streamPrint Project"

version := "1.0"

scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "2.3.0",
"org.apache.spark" %% "spark-streaming" % "2.3.0"
)

import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.storage.StorageLevel

object StreamPrint {
    def main(args: Array[String]) {
        val conf = new SparkConf().setAppName("streamPrint")
        val sc = new StreamingContext(conf, Seconds(5))
        val lines = sc.socketTextStream("192.168.111.1", 1234, StorageLevel.MEMORY_AND_DISK)
        if (lines != null) {
                lines.print()
                println("start!")
        }
        sc.start()
        sc.awaitTermination()
    }
}

/usr/local/sbt/sbt package

kinit -kt /root/keytab/ranger_yarn.keytab ranger_yarn
spark-submit --class "StreamPrint" /root/streamprint-project_2.11-1.0.jar --deploy-mode cluster --master yarn