Pandas selecting by label sometimes returns Series, sometimes returns DataFrame
Granted, the behavior is inconsistent, but I think it's easy to imagine cases where this is convenient. Anyway, to get a DataFrame every time, just pass a list to loc. There are other ways, but in my opinion this is the cleanest.
In [2]: type(df.loc[[3]])
Out[2]: pandas.core.frame.DataFrame
In [3]: type(df.loc[[1]])
Out[3]: pandas.core.frame.DataFrame
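To make the contrast concrete, here is a minimal sketch. The frame below is an assumption (the original df is not shown in this answer): one column named 0 and an index in which label 3 appears three times. It compares a scalar label with a one-element list for both a unique and a duplicated label.

```python
import pandas as pd

# Assumed example frame, not taken from the answer itself:
# a single column named 0, with index label 3 repeated three times.
df = pd.DataFrame(data=range(5), index=[1, 2, 3, 3, 3])

for label in (1, 3):
    scalar_result = df.loc[label]    # type depends on whether the label is unique
    list_result = df.loc[[label]]    # a list of labels keeps the row axis
    print(label, type(scalar_result).__name__, type(list_result).__name__)
```

With the list form, the result has one row per matching label, so downstream code can rely on DataFrame methods either way.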
The TLDR
When using loc
df.loc[:] = DataFrame
df.loc[int] = Series if the label occurs once in the index (whatever the number of columns), DataFrame if the label is duplicated
df.loc[:, ["col_name"]] = DataFrame, always (a list selector keeps the column axis, even with a single row)
df.loc[:, "col_name"] = Series
Not using loc
df["col_name"] = Series
df[["col_name"]] = DataFrame
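A quick way to check each case in this summary is to print the result type. The sketch below uses a hypothetical two-column frame df2 with unique index labels and unique column names (both are assumptions made for illustration).

```python
import pandas as pd

# Hypothetical frame for illustration: unique index labels, unique column names.
df2 = pd.DataFrame({"a": [10, 20, 30], "b": [1.0, 2.0, 3.0]}, index=[1, 2, 3])

results = {
    'df2.loc[:]': df2.loc[:],
    'df2.loc[1]': df2.loc[1],
    'df2.loc[:, ["a"]]': df2.loc[:, ["a"]],
    'df2.loc[:, "a"]': df2.loc[:, "a"],
    'df2["a"]': df2["a"],
    'df2[["a"]]': df2[["a"]],
}

# Map each expression to the name of the type it produced.
type_names = {expr: type(value).__name__ for expr, value in results.items()}
for expr, name in type_names.items():
    print(f"{expr:<20} -> {name}")
```

The pattern to take away is that a scalar selector drops its axis, while a list or slice selector keeps it.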
Your index contains the label 3 three times. For this reason, df.loc[3] returns a DataFrame.
The reason is that you don't specify the column. So df.loc[3] selects all three rows labelled 3 across all columns (here there is only column 0), while df.loc[3, 0] returns a Series. Likewise, df.loc[1:2] returns a DataFrame, because you slice the rows.
Selecting a single row (as df.loc[1]) returns a Series with the column names as the index.
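The cases described above can be sketched as follows. The frame is an assumed reconstruction based on the description (a single column 0, label 3 repeated three times), not the asker's actual data.

```python
import pandas as pd

# Assumed reconstruction of the frame discussed above.
df = pd.DataFrame(data=range(5), index=[1, 2, 3, 3, 3])

dup_rows = df.loc[3]       # three rows share label 3 -> DataFrame
dup_cell = df.loc[3, 0]    # same rows, single column 0 -> Series of length 3
row_slice = df.loc[1:2]    # label slice, inclusive on both ends -> DataFrame
single_row = df.loc[1]     # unique label -> Series indexed by column names

for name, value in [("dup_rows", dup_rows), ("dup_cell", dup_cell),
                    ("row_slice", row_slice), ("single_row", single_row)]:
    print(name, type(value).__name__, value.shape)
```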
If you want to be sure to always get a DataFrame, you can slice, as in df.loc[1:1]. Another option is boolean indexing (df.loc[df.index == 1]) or the take method (df.take([0]), but note that this uses positions, not labels!).
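A short sketch of these three alternatives, again on an assumed frame with a single column 0 and index [1, 2, 3, 3, 3]. It also shows why take needs care: its argument is a row position, so position 2 lands on a row labelled 3.

```python
import pandas as pd

# Assumed example frame; the original data is not shown in the answer.
df = pd.DataFrame(data=range(5), index=[1, 2, 3, 3, 3])

by_slice = df.loc[1:1]            # label slice from 1 to 1, inclusive
by_mask = df.loc[df.index == 1]   # boolean mask over the index
by_take = df.take([0])            # position 0, which happens to carry label 1

# take is positional: position 2 is the first of the rows labelled 3.
third_position = df.take([2])

for name, value in [("by_slice", by_slice), ("by_mask", by_mask),
                    ("by_take", by_take), ("third_position", third_position)]:
    print(name, type(value).__name__, list(value.index))
```

The slice form relies on the index being sorted when labels are duplicated, so the boolean mask is the safer choice for an arbitrary index.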