R summary() equivalent in numpy
No. You'll need to use pandas.
R is a language for statistics, so much of the basic functionality you need, like summary() and lm(), is loaded when you boot it up. Python has many uses, so you need to install and import the appropriate statistical packages. numpy isn't a statistics package; it's for numerical computation more generally, so you need packages like pandas, scipy and statsmodels to let Python do what R can do out of the box.
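To illustrate what "not out of the box" means in practice, here is a minimal sketch of a hand-rolled numeric summary using only numpy. The function name five_number_summary and the sample data are made up for illustration; the labels simply mimic the ones R prints.

```python
import numpy as np

def five_number_summary(values):
    """Return min, quartiles, median, mean and max for a 1-D sequence.

    Hypothetical helper: numpy has no built-in equivalent of R's summary(),
    so the pieces are assembled by hand here.
    """
    arr = np.asarray(values, dtype=float)
    # np.percentile uses linear interpolation by default
    q1, median, q3 = np.percentile(arr, [25, 50, 75])
    return {
        "Min.": arr.min(),
        "1st Qu.": q1,
        "Median": median,
        "Mean": arr.mean(),
        "3rd Qu.": q3,
        "Max.": arr.max(),
    }

sample = [1, 2, 3, 4, 5]  # made-up data for illustration
result = five_number_summary(sample)
for label, value in result.items():
    print(f"{label:>8}: {value}")
```

This only covers a single numeric array; it does nothing for categorical data or multiple columns, which is where pandas becomes the more practical choice.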
If you are looking for details like those from summary() in R, i.e.
- 5-point summary for numeric variables
- Frequency of occurrence of each class for categorical variables
then in Python you can use df.describe(include='all').
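A small sketch of that call, assuming a made-up DataFrame with one numeric and one categorical column (the column names and values are invented for illustration). Note that for categorical columns describe() reports the count, the number of unique classes, the most frequent class and its frequency, not a count for every class; value_counts() is one way to get the per-class frequencies.

```python
import pandas as pd

# Made-up example data: one numeric column, one categorical column
df = pd.DataFrame({
    "height": [150.0, 160.0, 170.0, 180.0],
    "group": ["a", "b", "a", "a"],
})

# Numeric columns get count/mean/std/min/quartiles/max;
# categorical columns get count/unique/top/freq
overview = df.describe(include="all")
print(overview)

# Frequency of every class in the categorical column
class_counts = df["group"].value_counts()
print(class_counts)
```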
1. Import pandas and load the CSV data file
import pandas as pd
data = pd.read_csv("data.csv", sep=",")
2. Examine first few rows of data
data.head()
3. Calculate summary statistics
summary = data.describe()
4. Transpose statistics to get a format similar to R's summary() function
summary = summary.transpose()
5. Display summary statistics in console
summary.head()