Sample file (sample1.txt)
firstName,lastName,email,phoneNumber
John,Doe,john@doe.com,0123456789
Jane,Doe,jane@doe.com,9876543210
James,Bond,james.bond@mi6.co.uk,0612345678
Program
## set up SparkSession
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .master("local") \
    .appName("PySpark Create RDD example") \
    .config("spark.some.config.option", "some-value") \
    .getOrCreate()

# Read the comma-separated file: the first row is the header,
# and inferSchema lets Spark guess each column's type.
df = spark.read.load("C:/Users/mhtpr/Documents/sample1.txt",
                     format="csv", sep=",", inferSchema="true", header="true")
Output:
df.show()
+---------+--------+--------------------+-----------+
|firstName|lastName|               email|phoneNumber|
+---------+--------+--------------------+-----------+
|     John|     Doe|        john@doe.com|  123456789|
|     Jane|     Doe|        jane@doe.com| 9876543210|
|    James|    Bond|james.bond@mi6.co.uk|  612345678|
+---------+--------+--------------------+-----------+
df.printSchema()
root
 |-- firstName: string (nullable = true)
 |-- lastName: string (nullable = true)
 |-- email: string (nullable = true)
 |-- phoneNumber: long (nullable = true)