pandas select from Dataframe using startswith
Using startswith for a particular column value
df = df.loc[df["SUBDIVISION"].str.startswith('INVERNESS', na=False)]
You can use the str.startswith
DataFrame method to give more consistent results:
In [11]: s = pd.Series(['a', 'ab', 'c', 11, np.nan])
In [12]: s
Out[12]:
0 a
1 ab
2 c
3 11
4 NaN
dtype: object
In [13]: s.str.startswith('a', na=False)
Out[13]:
0 True
1 True
2 False
3 False
4 False
dtype: bool
and the boolean indexing will work just fine (I prefer to use loc
, but it works just the same without):
In [14]: s.loc[s.str.startswith('a', na=False)]
Out[14]:
0 a
1 ab
dtype: object
.
It looks least one of your elements in the Series/column is a float, which doesn't have a startswith method hence the AttributeError, the list comprehension should raise the same error...
To retrieve all the rows which startwith required string
dataFrameOut = dataFrame[dataFrame['column name'].str.match('string')]
To retrieve all the rows which contains required string
dataFrameOut = dataFrame[dataFrame['column name'].str.contains('string')]