PySpark provides several common ways to create DataFrames on the fly: from a range of values, from Python iterables such as lists, or from an existing RDD.

Method 1 — SparkSession range() method

# Create a DataFrame from a range of values
df_range_1 = spark.range(5)
df_range_1.show(5, truncate = False)
# You can optionally specify start, end and step as well
df_range_2 = spark.range(start = 1, end = 10, step = 2)
df_range_2.show(10, False)
Method 2 — Spark createDataFrame() method

# Create Python Native List of Data
_data = [
    ["1", "Ram"],
    ["2", "Shyam"],
    ["3", "Asraf"],
    ["4", None]
]
# Create the list of column names
_cols = ["id", "name"]
# Create Data Frame using the createDataFrame method
df_users = spark.createDataFrame(data = _data, schema=_cols)
# Check Data Frame
df_users.show()
Method 3 — Spark toDF() method

# Create a new RDD from the same data list
_data_rdd = spark.sparkContext.parallelize(_data)
# Check the number of partitions of the RDD
print(_data_rdd.getNumPartitions())
# Create Data Frame from the rdd
df_users_new = _data_rdd.toDF(_cols)
