Apache Spark for Beginners - Chapter 1.1: Configure the Spark Web UI Locally
Objective
Configure the Spark Web UI in your local environment.
Prerequisites
1. Configure spark-defaults.conf
1.1 Go to the Spark home directory
cd /usr/local/Cellar/apache-spark/3.0.0/libexec
1.2 Copy the spark-defaults.conf template
cp conf/spark-defaults.conf.template conf/spark-defaults.conf
1.3 Add these values
cat <<EOT >> conf/spark-defaults.conf
# Options for the daemons used in the standalone deploy mode
spark.master spark://localhost:7077
spark.eventLog.enabled true
spark.eventLog.dir file:///tmp/spark-events
spark.history.fs.logDirectory file:///tmp/spark-events
EOT
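Spark does not create the event log directory on its own. With spark.eventLog.enabled set to true, an application fails at startup if the directory is missing, and the history server refuses to start for the same reason. A minimal sketch, assuming the /tmp/spark-events path configured above:

```shell
# Create the directory named by spark.eventLog.dir and spark.history.fs.logDirectory.
# -p makes this safe to re-run when the directory already exists.
mkdir -p /tmp/spark-events

# Confirm it is there and check its permissions.
ls -ld /tmp/spark-events
```

Because /tmp is usually cleared on reboot, this step may need repeating after a restart.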
2. Configure spark-env.sh
2.1 Copy the spark-env.sh.template
cp conf/spark-env.sh.template conf/spark-env.sh
2.2 Add these values
cat <<EOT >> conf/spark-env.sh
# Options for the daemons used in the standalone deploy mode
SPARK_MASTER_HOST=localhost
SPARK_MASTER_PORT=7077
SPARK_MASTER_WEBUI_PORT=8080
SPARK_LOCAL_IP=localhost
EOT
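Both configuration steps use >> to append, so running them a second time writes the same settings twice. One possible guard is sketched below. The append_once helper and the scratch file are hypothetical names introduced for illustration, not part of Spark, and the demo writes to a temporary file rather than the real conf/spark-env.sh:

```shell
# append_once FILE LINE
# Append LINE to FILE only if that exact line is not already present.
append_once() {
  file="$1"
  line="$2"
  # -x matches whole lines, -F treats the pattern as a fixed string, -q is quiet
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demo against a scratch file (hypothetical), not the real configuration
demo_conf="$(mktemp)"
append_once "$demo_conf" "SPARK_MASTER_HOST=localhost"
append_once "$demo_conf" "SPARK_MASTER_HOST=localhost"   # second call is a no-op
append_once "$demo_conf" "SPARK_MASTER_PORT=7077"
cat "$demo_conf"
```

To apply the same idea to the real file, pass conf/spark-env.sh as the first argument.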
3. Start the Spark daemons
cd $SPARK_HOME
./sbin/start-all.sh
./sbin/start-history-server.sh
3.1 Spark Web UI URLs
Master Node: http://localhost:8080/
Spark Job Node: http://localhost:4040/ (available only while an application is running)
History Server: http://localhost:18080/
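A quick way to see which of these UIs is answering is to probe each port from the terminal. The sketch below assumes curl is installed and uses a hypothetical helper named check_ui. The port 4040 entry is expected to report "not reachable" unless an application is running at that moment:

```shell
# check_ui NAME URL
# Report whether anything answers at URL within 3 seconds.
check_ui() {
  name="$1"
  url="$2"
  # --fail makes curl return non-zero on HTTP errors (status 400 and above)
  if curl --silent --fail --max-time 3 --output /dev/null "$url"; then
    echo "$name: up ($url)"
  else
    echo "$name: not reachable ($url)"
  fi
}

check_ui "Master" http://localhost:8080/
check_ui "Job UI" http://localhost:4040/
check_ui "History Server" http://localhost:18080/
```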
3.2 Run an example
Clone the spark-poc project, then build and run it.
Refer to Chapter 1 for setting up and running a Spark application locally.
cd $SPARK_HOME
./bin/spark-submit --class dev.template.spark.WordCount \
--master spark://localhost:7077 \
--driver-memory 2g \
--executor-memory 2g \
--executor-cores 4 \
/Users/e1xx/repos/spark-scala-gradle-bootstrap/build/libs/spark-scala-gradle-bootstrap-2.12.0-all.jar \
/Users/e1xx/repos/spark-scala-gradle-bootstrap/src/test/resources/wordcount_intput.txt \
/tmp/wordcount/WordCount_$(date +%F_%T)
3.3 Verify the Spark Master Web UI
3.4 Verify the Spark Job UI while running the application
3.5 Verify the history-server after the job completes
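The history server can also be checked without a browser. It serves Spark's monitoring REST API under /api/v1, and it builds its application list from the event log files. The sketch below assumes curl is installed and uses the port and directory configured earlier in this chapter. The list_history helper and the HISTORY_URL and EVENT_DIR variables are hypothetical names for illustration:

```shell
HISTORY_URL="${HISTORY_URL:-http://localhost:18080}"
EVENT_DIR="${EVENT_DIR:-/tmp/spark-events}"

# list_history BASE_URL
# Print the JSON list of applications known to the history server at BASE_URL.
list_history() {
  if curl --silent --fail --max-time 3 "$1/api/v1/applications"; then
    echo   # the JSON response may not end with a newline
  else
    echo "History Server not reachable at $1"
  fi
}

list_history "$HISTORY_URL"

# The raw event logs that the history server reads, one per application
ls -l "$EVENT_DIR" 2>/dev/null || echo "No event log directory at $EVENT_DIR"
```

A completed run should appear both as an entry in the JSON output and as a file in the event log directory.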
Next:
Chapter 2: Applications -> Jobs -> Stages -> Tasks [TBD]