Pandas: Regular expressions with str.contains
The pd.Series.str.contains method treats its pat argument as a regular expression by default:
>>> import re
>>> import pandas as pd
>>> df = pd.DataFrame()
>>> df['item'] = [1, 2, 3, 4, 5, 6]
>>> df['size'] = ['SMALL', 'small', 'medium', 'large', 'large', 'large']
>>> df
   item    size
0     1   SMALL
1     2   small
2     3  medium
3     4   large
4     5   large
5     6   large
>>> df[df['size'].str.contains(pat='small|medium')]
   item    size
1     2   small
2     3  medium
You can also pass regex flags:
>>> df[df['size'].str.contains(pat='small|medium', flags=re.IGNORECASE)]
   item    size
0     1   SMALL
1     2   small
2     3  medium
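For simple case-insensitive matching, the case parameter of str.contains should give the same result as passing re.IGNORECASE. The following is a minimal sketch on a small standalone Series (the name sizes is made up for illustration), not a statement about which form is preferred:

```python
import re
import pandas as pd

# Hypothetical standalone data mirroring the 'size' column above.
sizes = pd.Series(['SMALL', 'small', 'medium', 'large'])

# Case-insensitive match via an explicit regex flag.
with_flag = sizes.str.contains(pat='small|medium', flags=re.IGNORECASE)

# Case-insensitive match via the case parameter.
with_case = sizes.str.contains(pat='small|medium', case=False)

print(with_flag.tolist())
print(with_flag.equals(with_case))
```

Both calls mark the first three entries as matches. The flags argument remains the more general option, since it accepts any flag from the re module.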
Set regex=False to treat pat as a literal string instead of a regular expression.
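The difference matters when the pattern contains regex metacharacters such as the pipe. A small sketch, using a made-up Series named labels in which one value contains a literal pipe character:

```python
import pandas as pd

# Hypothetical data: one label contains a literal '|' character.
labels = pd.Series(['small|medium', 'small', 'medium', 'large (xl)'])

# Default (regex=True): '|' means alternation, so any value
# containing 'small' or 'medium' matches.
as_regex = labels.str.contains(pat='small|medium')

# regex=False: only values containing the exact text 'small|medium' match.
as_literal = labels.str.contains(pat='small|medium', regex=False)

print(as_regex.tolist())    # [True, True, True, False]
print(as_literal.tolist())  # [True, False, False, False]
```

With regex=False, only the first label matches, because the pipe is compared as an ordinary character.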
Via pandas.pydata.org.