DataFrame Operations
Transformations
Selection and Filtering
from pyspark.sql.functions import col

# Select columns
df.select("name", "age", "city")
df.select(col("name"), col("age") + 1)
# Select with SQL expressions
df.selectExpr("name", "age + 1 as age_plus_one", "UPPER(city) as city_upper")
# Filter rows
df.filter(col("age") > 21)
df.where(col("status") == "active")
df.filter("age > 21 AND status = 'active'")
Joins
# Inner join
df1.join(df2, df1["id"] == df2["id"], "inner")
# Left outer join
df1.join(df2, "id", "left")
# Cross join
df1.crossJoin(df2)
Grouping and Aggregation
Sorting
Set Operations
Column Operations
All Transformations
Method
Description
Actions
Method
Description
Examples
Temporary Views
Read Operations
Supported Read Formats
Format
Method
Write Operations
Write Modes
Mode
Behavior
