
Given a pandas.DataFrame with some nulls, e.g.:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(
...     data={
...         'a': [1, np.nan, 3, 4, 5, np.nan],
...         'b': [np.nan, 2, 3, 4, 5, 6],
...         'c': [1, 2, 3, 4, 5, 6],
...     },
... )
>>> df
     a    b  c
0  1.0  NaN  1
1  NaN  2.0  2
2  3.0  3.0  3
3  4.0  4.0  4
4  5.0  5.0  5
5  NaN  6.0  6

you can easily list which columns have nulls:

>>> df.columns[pd.isnull(df).sum() > 0].values
array(['a', 'b'], dtype=object)
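
To see why the column expression works, it can help to look at the intermediate step: summing the boolean null mask gives a per-column null count, and comparing that count to zero gives a boolean mask over the columns. Here is a minimal sketch that rebuilds the same example frame; the names `null_counts` and `cols_with_nulls` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data={
        'a': [1, np.nan, 3, 4, 5, np.nan],
        'b': [np.nan, 2, 3, 4, 5, 6],
        'c': [1, 2, 3, 4, 5, 6],
    },
)

# Number of nulls in each column (a Series indexed by column name).
null_counts = df.isnull().sum()

# Columns whose null count is positive, as a plain Python list.
cols_with_nulls = df.columns[null_counts > 0].tolist()

print(null_counts)
print(cols_with_nulls)
```

If only the names matter and not the counts, `df.isnull().any()` should give the same column mask without the sum.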

and which rows have nulls:

>>> df[df.isnull().any(axis='columns')].index.values
array([0, 1, 5])
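
The row lookup can also be written by applying the boolean mask to `df.index` directly, which skips building the filtered DataFrame first. This is a sketch on the same example data, with `row_mask` and `rows_with_nulls` as illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data={
        'a': [1, np.nan, 3, 4, 5, np.nan],
        'b': [np.nan, 2, 3, 4, 5, 6],
        'c': [1, 2, 3, 4, 5, 6],
    },
)

# True for every row that has at least one null value.
row_mask = df.isnull().any(axis='columns')

# Index labels of those rows, as a plain Python list.
rows_with_nulls = df.index[row_mask].tolist()

print(rows_with_nulls)  # [0, 1, 5]
```

The same `row_mask` can be reused as `df[row_mask]` when the rows themselves are needed rather than just their labels.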
