Apache Spark Deep Learning Cookbook

How to do it...

This section walks through the steps for converting the string values in the gender column of the dataframe to numeric values:

  • Female --> 0 
  • Male --> 1
  1. Converting a column value inside a dataframe requires importing functions:
from pyspark.sql import functions
  2. Next, modify the gender column to a numeric value using the following script:
df = df.withColumn('gender', functions.when(df['gender'] == 'Female', 0).otherwise(1))
  3. Finally, reorder the columns so that gender is the last column in the dataframe using the following script:
df = df.select('height', 'weight', 'gender')