You work for MDFT Academy, a well-known training agency. You have a Fabric workspace that uses the default Spark starter pool and runtime version 1.2.
You plan to read a CSV file named students_raw.csv in a lakehouse, select columns, and save the data as a Delta table to the managed area of the lakehouse. students_raw.csv contains 12 columns.
You have the following code:
from pyspark.sql.functions import year
(spark
    .read
    .format("csv")
    .option("header", True)
    .load("Files/students_raw.csv")
    .select("SalesOrderNumber", "OrderDate", "StudentName", "UnitPrice")
    .withColumn("Year", year("OrderDate"))
    .write
    .partitionBy("Year")
    .saveAsTable("students")
)
Which statements are true?
Choose all correct answers from the options below.
Explanations for each answer: