PySpark 4.0.1
If PySpark 4.0.1 fails with seemingly random errors that make little sense or have nothing to do with your code, make sure that:
- You are running Python 3.11 (not 3.12 or any other version). I was running 3.12 and getting completely unrelated errors such as “executor has crashed” or “Java gateway process exited before sending the driver its port number”. The real cause was simply that PySpark 4.0.1 did not work with Python 3.12 in my setup.
- You use Java 17 or Java 21. Java 8 is no longer supported by PySpark 4.x. Make sure the JAVA_HOME environment variable (not PATH) points to a Java 17 or Java 21 installation.
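
Both conditions can be checked before starting Spark with a small script along these lines. This is only a sketch: the function names parse_java_major and check_environment are made up for illustration, and it assumes the Java launcher lives at JAVA_HOME/bin/java, which may not hold for every installation.

```python
import os
import re
import subprocess
import sys


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    major = int(match.group(1))
    # Legacy scheme: Java 8 reports itself as "1.8.0_xxx".
    if major == 1 and match.group(2):
        return int(match.group(2))
    return major


def check_environment():
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []

    # Check 1: the interpreter version recommended in this post.
    if sys.version_info[:2] != (3, 11):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python is {found}, expected 3.11")

    # Check 2: JAVA_HOME (not PATH) must point to Java 17 or 21.
    java_home = os.environ.get("JAVA_HOME")
    if not java_home:
        problems.append("JAVA_HOME is not set")
        return problems

    # Assumption: the launcher is at <JAVA_HOME>/bin/java.
    java_bin = os.path.join(java_home, "bin", "java")
    try:
        result = subprocess.run(
            [java_bin, "-version"], capture_output=True, text=True
        )
    except OSError as exc:
        problems.append(f"could not run {java_bin}: {exc}")
        return problems

    # `java -version` normally writes to stderr; read both streams to be safe.
    major = parse_java_major(result.stderr + result.stdout)
    if major not in (17, 21):
        problems.append(f"JAVA_HOME points to Java {major}, expected 17 or 21")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("Python and Java versions look fine.")
```

Run it with the same interpreter and in the same shell you use to launch PySpark, otherwise it may inspect a different Python or a different JAVA_HOME than the one Spark actually sees.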
