Finding nulls in DataFrames
Given a pandas.DataFrame with some nulls, e.g.:
>>> import numpy as np
>>> import pandas as pd
>>> df = (
...     pd.DataFrame(
...         data={
...             'a': [1, np.nan, 3, 4, 5, np.nan],
...             'b': [np.nan, 2, 3, 4, 5, 6],
...             'c': [1, 2, 3, 4, 5, 6],
...         },
...     )
... )
>>> df
     a    b  c
0  1.0  NaN  1
1  NaN  2.0  2
2  3.0  3.0  3
3  4.0  4.0  4
4  5.0  5.0  5
5  NaN  6.0  6
you can easily list which columns have nulls:
>>> df.columns[pd.isnull(df).sum() > 0].values
array(['a', 'b'], dtype=object)
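If you also want to know how many nulls each column holds, the same `isnull()` mask can be summed per column. The following is a minimal sketch that rebuilds the example frame from above; the names `null_counts` and `cols_with_nulls` are only illustrative.

```python
import numpy as np
import pandas as pd

# Same example frame as above.
df = pd.DataFrame({
    'a': [1, np.nan, 3, 4, 5, np.nan],
    'b': [np.nan, 2, 3, 4, 5, 6],
    'c': [1, 2, 3, 4, 5, 6],
})

# Number of nulls in each column (a Series indexed by column name).
null_counts = df.isnull().sum()

# Columns with at least one null, using a boolean mask instead of a count.
cols_with_nulls = df.columns[df.isnull().any()].tolist()
```

Using `.any()` rather than `.sum() > 0` expresses the same check as a boolean mask, and `.tolist()` returns a plain Python list instead of a NumPy array.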
and which rows have nulls:
>>> df[df.isnull().any(axis='columns')].index.values
array([0, 1, 5])
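To go one step further and pinpoint the individual null cells, one option is to run `np.where` on the boolean mask and map the positions back to labels. This is a rough sketch on the same example frame; the names `rows_with_nulls` and `null_cells` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same example frame as above.
df = pd.DataFrame({
    'a': [1, np.nan, 3, 4, 5, np.nan],
    'b': [np.nan, 2, 3, 4, 5, 6],
    'c': [1, 2, 3, 4, 5, 6],
})

# The rows themselves (not just their index labels) that contain a null.
rows_with_nulls = df[df.isnull().any(axis='columns')]

# Positional (row, column) indices of every null cell, in row-major order.
row_pos, col_pos = np.where(df.isnull().to_numpy())

# Translate positions into (index label, column name) pairs.
null_cells = list(zip(df.index[row_pos].tolist(), df.columns[col_pos].tolist()))
```

Here `null_cells` pairs each row label with the name of the column that holds the null, which can be handy when a row has nulls in more than one column.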