
I use pd.Series.str.split() a lot for feature engineering, and I recently learned about its useful expand option. When expand is set to True (the default is False), the pieces of each string go into separate columns and the result is a DataFrame instead of a Series of lists.

>>> import pandas as pd
>>> s = pd.Series(
        data = [
            'abc,def,ghi',
            'aaa,bbb,ccc',
            '1,2,3'
        ]
    )
>>> s.str.split(
        pat=',',
    )
0     [abc, def, ghi]
1     [aaa, bbb, ccc]
2           [1, 2, 3]
dtype: object
>>> s.str.split(
        pat=',',
        expand=True,
    )
     0    1    2
0  abc  def  ghi
1  aaa  bbb  ccc
2    1    2    3
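
For feature engineering, the expanded result usually needs column names, and real data often has rows with differing numbers of pieces. Here is a minimal sketch of both cases. The sample data and the column names first, second, and third are made up for illustration, and the n argument (which caps the number of splits) is included only as one option that may be handy.

```python
import pandas as pd

# Illustrative data: the second row has only two pieces
s = pd.Series(['abc,def,ghi', 'aaa,bbb', '1,2,3'])

# expand=True returns a DataFrame; shorter rows are padded with missing values
parts = s.str.split(pat=',', expand=True)

# Hypothetical column names, chosen only for this example
parts.columns = ['first', 'second', 'third']

# n=1 limits the split to one cut per string, giving at most two columns
head_tail = s.str.split(pat=',', n=1, expand=True)

print(parts)
print(head_tail)
```

Because the result is an ordinary DataFrame, it can be joined back onto the original data with something like pd.concat([df, parts], axis=1), assuming df shares the same index.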
