### Software Versions and Tools

- JDK 8
- Spark-2.4.3: [Download](https://blog-1310034074.cos.ap-hongkong.myqcloud.com/BigData/spark-2.4.3-bin-hadoop2.7.tgz)
- Hadoop-2.7.1: [Download](https://blog-1310034074.cos.ap-hongkong.myqcloud.com/BigData/hadoop-2.7.1.tar.gz)
- winutils-master: [Download](https://blog-1310034074.cos.ap-hongkong.myqcloud.com/BigData/winutils-master.zip)

---

### Installation Steps

**1. Install Hadoop**

Unzip the *winutils* and *Hadoop* packages. When developing **Spark** programs in IDEA, you need to simulate the *Hadoop* environment on the development machine; otherwise every debug run means packaging the code into a jar and submitting it to the cluster, which seriously hurts development efficiency. *winutils* provides the Hadoop support binaries required on Windows, including the essential tools needed to debug Hadoop and Spark there.

<!-- more -->

Enter the winutils directory and copy all of its contents into the `bin` directory of the Hadoop installation, adding or replacing files as needed.

Right-click *My Computer - Properties - Advanced System Settings - Environment Variables*, create a new system variable named **HADOOP_HOME**, and set its value to the Hadoop directory updated in the previous step.



Find the **Path** variable, double-click it to open the edit dialog, click *New*, and add an entry pointing to Hadoop's `bin` directory.



Open the **etc\hadoop** directory under the Hadoop folder, edit the **hadoop-env.cmd** file, and change `JAVA_HOME` to the path your system's `JAVA_HOME` variable points to.


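Before moving on, it can help to sanity-check this step. The short Python sketch below is my addition, not part of the original walkthrough; it simply confirms that **HADOOP_HOME** is set and that `winutils.exe` ended up in the `bin` directory. Note that new environment variables are only visible to shells and IDEs started after you set them.

```python
import os

# HADOOP_HOME must point at the Hadoop installation directory.
hadoop_home = os.environ.get("HADOOP_HOME")
if not hadoop_home:
    raise SystemExit("HADOOP_HOME is not set - revisit the environment variable step.")

# On Windows, Hadoop and Spark need winutils.exe in %HADOOP_HOME%\bin.
winutils = os.path.join(hadoop_home, "bin", "winutils.exe")
print("HADOOP_HOME =", hadoop_home)
print("winutils.exe present:", os.path.exists(winutils))
```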
---

**2. Install Python**

Since Spark 2.4 on Hadoop 2.7 does not work with newer Python releases (3.8 and later), we use **Anaconda** to build a Python 3.6 environment. After installing Anaconda, open Anaconda Navigator, select **Environments**, and create a new Python 3.6.13 environment.



---

**3. Install Spark**

Unzip Spark into the same directory as Hadoop and configure the environment variables just as with Hadoop: a **SPARK_HOME** variable plus a **Path** entry for its `bin` directory.

Copy the **pyspark** package from the Spark directory into the `Lib` directory of the Python environment.



Enter the `Scripts` directory of the Python environment and run *pip install py4j* to install **py4j**. Py4J is a library written in Python and Java: through Py4J, Python programs can dynamically access Java objects in the Java virtual machine, and Java programs can call back into Python objects. PySpark itself is built on this bridge, which is why py4j is required.



Open cmd and enter *spark-shell*. If the shell starts without errors, the Spark configuration is successful.

---

**4. Running PySpark**

Write a test program in **Spyder**. Here we use the classic word-count program, which counts how many times each word occurs in a text file. Save it after writing so it can be run easily.

```python
"""
@author: JackyMu
"""
from pyspark import SparkConf, SparkContext

# Run Spark locally in a single process.
conf = SparkConf().setAppName("WordCount").setMaster("local")
sc = SparkContext(conf=conf)

inputFile = ""  # file location
textFile = sc.textFile(inputFile)

# Split each line into words, map each word to (word, 1),
# then sum the counts per word.
wordCount = (textFile.flatMap(lambda line: line.split(" "))
             .map(lambda word: (word, 1))
             .reduceByKey(lambda a, b: a + b))

wordCount.foreach(print)

sc.stop()
```
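One caveat worth knowing (my note, not from the original post): `foreach(print)` runs on the executor processes, so in local mode the output happens to appear in your console, but on a real cluster it would land in the executors' logs instead. A more portable pattern is to bring the results back to the driver with `collect()`, which is fine here because a word-count result is small. To use it, replace the `wordCount.foreach(print)` line with:

```python
# Portable alternative: collect the (word, count) pairs to the driver
# and print them there. Only safe when the result fits in driver memory.
for word, count in wordCount.collect():
    print(word, count)
```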
Open the **Anaconda Prompt**, enter `activate python36` to activate the Python environment (*python36* being whatever name you gave it), then run the script saved in the previous step, e.g. `python wordcount.py`, assuming the hypothetical file name *wordcount.py*.



---

### Summary

This post briefly introduced the installation and configuration of PySpark under Windows. While a task is executing, you can also open `localhost:4040` in a browser to reach the monitoring page that Spark starts for the running application.
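Note that the page on port 4040 only exists while the application is alive, so a short script may finish before you can open it. A minimal sketch (my addition, with a hypothetical app name) that runs a trivial job and then pauses so you have time to browse the UI:

```python
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("UIDemo").setMaster("local")
sc = SparkContext(conf=conf)

# Run a trivial job so the UI at http://localhost:4040 has something to show.
print(sc.parallelize(range(1000)).sum())

# The UI disappears as soon as the application stops, so pause here
# and open http://localhost:4040 in a browser before pressing Enter.
input("Press Enter to stop Spark and close the UI...")
sc.stop()
```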