![how to download spark 2.7 tgz on windows 10](https://i.ytimg.com/vi/lPYSlqe5Pu0/maxresdefault.jpg)
![how to download spark 2.7 tgz on windows 10](https://miro.medium.com/max/1400/1*xxWck3aRItzb9AH427ETpg.png)
I decided to teach myself how to work with big data and came across Apache Spark. I had heard of Apache Hadoop, but using Hadoop for big data meant writing code in Java, which I was not looking forward to because I love writing code in Python. Spark supports a Python programming API called PySpark that is actively maintained, and that was enough to convince me to start learning PySpark for working with big data.

In this post, I describe how I got started with PySpark on Windows. The screenshots are specific to Windows 10. I am also assuming that you are comfortable working with the Command Prompt on Windows. You do not have to be an expert, but you need to know how to start a Command Prompt and run commands such as those that help you move around your computer’s file system. In case you need a refresher, a quick introduction might be handy.

Many open source projects do not have good Windows support, so I first had to figure out whether Spark and PySpark would work well on Windows. The official Spark documentation does mention support for Windows. PySpark requires Java version 7 or later and Python version 2.6 or later. Let’s first check whether they are already installed, or install them, and make sure that PySpark can work with these two components.
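The check for Java and Python mentioned above can be sketched as a small script. This is only an illustration, not part of the original setup steps: the helper names `parse_java_major` and `installed_java_major` are made up for this example, the minimum versions are the ones quoted in this post, and the script itself assumes a reasonably recent Python 3 (3.5 or later) with `java` reachable on your `PATH`.

```python
import re
import shutil
import subprocess
import sys

# Minimum versions as quoted in this post (assumption: they apply to your Spark release).
MIN_JAVA = 7
MIN_PYTHON = (2, 6)


def parse_java_major(version_output):
    """Return the Java major version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_201").
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major():
    """Run `java -version` and return the major version, or None if Java is not found."""
    if shutil.which("java") is None:
        return None
    result = subprocess.run(
        ["java", "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # `java -version` writes its banner to stderr, so inspect both streams.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    python_ok = sys.version_info[:2] >= MIN_PYTHON
    print("Python", sys.version.split()[0], "OK" if python_ok else "too old")

    java_major = installed_java_major()
    if java_major is None:
        print("Java not found on PATH")
    else:
        print("Java", java_major, "OK" if java_major >= MIN_JAVA else "too old")
```

Save it under any name you like and run it from the Command Prompt with `python`. If Java is reported as missing, either it is not installed or its `bin` folder is not on your `PATH`.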