
Pandas Tutorial

 




Pandas is a Python package that provides fast, flexible, and expressive data structures designed for data manipulation. It was developed by Wes McKinney in 2008 and is used for data analysis in Python.

Features of Pandas:-

Some important features of Python Pandas are as follows:-

(i) Handling of Data:- The Pandas library provides a fast and efficient way to manage and explore data. It provides two data structures, Series and DataFrame, which help us not only to represent data efficiently but also to manipulate it in various ways.

(ii) Input and Output tools:- Pandas provides a wide array of built-in tools for reading and writing data.

(iii) Visualise:- Visualising the data is an important part of data science. It makes the results of a study easier for people to understand. Pandas has a built-in ability to plot your data and show it as various kinds of graphs.

(iv) Grouping:- With this feature of Pandas you can split data into categories of your choice, according to the criteria you set. The groupby() function splits the data, applies a function to each group, and then combines the results.

(v) Merging and joining of datasets:- While analysing data, we often need to merge and join multiple datasets to create a final dataset that can be analysed properly. Pandas can merge such datasets efficiently.

(vi) Optimised performance:- Pandas is well optimised, which makes it fast and suitable for data analysis. The critical code for Pandas is written in C or Cython, which makes it responsive and fast.
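As a rough illustration of the grouping and merging features described above, the following sketch uses two small made-up tables. The column names and values are assumptions chosen only for the example.

```python
import pandas as pd

# Made-up sample data for illustration only.
sales = pd.DataFrame({"city": ["Pune", "Delhi", "Pune"], "amount": [100, 200, 50]})
regions = pd.DataFrame({"city": ["Pune", "Delhi"], "region": ["West", "North"]})

# Grouping: split by city, sum each group, combine into one Series.
totals = sales.groupby("city")["amount"].sum()
print(totals)

# Merging: attach the region to every sale by matching on "city".
merged = sales.merge(regions, on="city")
print(merged)
```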

Installing Pandas:-

(i) Click on start button and type command prompt

(ii) Right click on command prompt, run as administrator

(iii) Type the command:- pip install pandas
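Once the command finishes, one simple way to check the installation is to import the package in Python and print its version. This is only a quick sanity check; the version printed depends on what pip installed.

```python
import pandas as pd

# If the import succeeds, Pandas is installed; the version string varies.
print(pd.__version__)
```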

Python Pandas Data Structure:- A data structure is a particular way of storing and organising data in a computer so that it can be accessed and worked with in appropriate ways. Pandas provides two data structures for processing data, Series and DataFrame, which are discussed below:

(1) Series: It is defined as a one-dimensional labelled array that is capable of storing various data types. The row labels of a Series are called the index. We can easily convert a list, tuple, or dictionary into a Series using the Series() constructor. A Series cannot have multiple columns.

It has two main components :-
(i) An array of actual data
(ii) An associated array of indexes or data labels.

Creation of Series Object:-
Series type objects can be created in many ways:-

(i) Empty Series using the Series() constructor.
(ii) Non-empty Series using a list/sequence, ndarray, dictionary, or scalar value.
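A minimal sketch of the empty case follows. Passing an explicit dtype is a choice made here so that the result does not depend on the default dtype for empty data.

```python
import pandas as pd

# An empty Series; the dtype is given explicitly.
empty = pd.Series(dtype="float64")
print(empty.empty)  # True
print(len(empty))   # 0
```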

Creating Non Empty Series using Sequences or list:-
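A short sketch, assuming some sample marks as data: a list gets the default integer index, while a tuple is given custom labels through the index argument.

```python
import pandas as pd

# From a list: the index defaults to 0, 1, 2.
marks = pd.Series([85, 92, 78])
print(marks)

# From a tuple, with custom index labels.
named = pd.Series((85, 92, 78), index=["a", "b", "c"])
print(named["b"])  # 92
```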




Creating Series from array:-
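A possible example, assuming NumPy is installed alongside Pandas (the array values and labels are made up):

```python
import numpy as np
import pandas as pd

# The ndarray supplies the data; the index argument supplies the labels.
arr = np.array([1.5, 2.5, 3.5])
s = pd.Series(arr, index=["x", "y", "z"])
print(s)
```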


Creating Series using dictionary:-
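In this sketch the dictionary keys become the index labels and the dictionary values become the data. The variable name is chosen only for the example.

```python
import pandas as pd

# Keys become the index; values become the data.
days = pd.Series({"Jan": 31, "Feb": 28, "Mar": 31})
print(days)
print(days["Feb"])  # 28
```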


Creating Series using Scalar Values:-
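With a scalar value, an index is needed so that Pandas knows how many times to repeat the value. A minimal sketch with made-up labels:

```python
import pandas as pd

# The scalar 10 is repeated once for every index label.
s = pd.Series(10, index=["a", "b", "c"])
print(s)
```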


  • Series Object Attributes:-
   Some common attributes:-
(a) Series.index:- Returns the index of the Series.
(b) Series.values:- Returns the data as an ndarray.
(c) Series.shape:- Returns a tuple of the shape of the underlying data.
(d) Series.dtype:- Returns the dtype of the data.
(e) Series.nbytes:- Returns the number of bytes in the data.
(f) Series.ndim:- Returns the number of dimensions.
(g) Series.size:- Returns the number of elements.
(h) Series.index.name:- Gets or sets the name of the index.
(i) Series.hasnans:- Returns True if there are any NaN values.
(j) Series.empty:- Returns True if the Series object is empty.
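The attributes above can be tried on a small Series. The data and the index name "label" below are made up for the example; the float64 dtype is what makes nbytes come out as 4 x 8 bytes.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, np.nan, 4.0], index=["a", "b", "c", "d"])
s.index.name = "label"  # assign a name to the index

print(s.index)
print(s.values)
print(s.shape)    # (4,)
print(s.dtype)    # float64
print(s.nbytes)   # 32
print(s.ndim)     # 1
print(s.size)     # 4
print(s.hasnans)  # True
print(s.empty)    # False
```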



head() and tail() Function:-
head(<n>) fetches the first n rows from a pandas object. If you do not provide a value for n, it returns the first 5 rows.
tail(<n>) fetches the last n rows from a pandas object. If you do not provide a value for n, it returns the last 5 rows.
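A small sketch of both functions on a Series holding the numbers 0 to 9:

```python
import pandas as pd

s = pd.Series(range(10))

print(s.head())   # first 5 rows: values 0 to 4
print(s.head(3))  # first 3 rows: values 0 to 2
print(s.tail())   # last 5 rows: values 5 to 9
print(s.tail(2))  # last 2 rows: values 8 and 9
```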




Python Pandas DataFrame:-
A DataFrame is another pandas data structure, which stores data in a two-dimensional, tabular form. It has the following properties:

  • The columns can be of heterogeneous types such as int, bool, and so on.
  • It can be seen as a dictionary of Series in which both the rows and the columns are indexed. The column labels are called "columns" and the row labels are called the "index".
  • Conceptually it is like a spreadsheet where each value is identifiable with the combination of row index and column name.
  • The indexes can be numbers, letters or strings.
  • It is value-mutable.
  • We can add or delete rows/columns in a DataFrame. So, it is size-mutable.
Creating a DataFrame:-
A DataFrame object can be created in many different ways, using:
(i)  2-D dictionaries, i.e.
  • dictionaries whose values are lists.
  • dictionaries whose values are dictionaries.
  • dictionaries whose values are ndarrays.
  • dictionaries whose values are Series.
(ii) Two-dimensional (2-D) ndarrays.
(iii) Another DataFrame object.
(iv) A list of dictionaries.
(v) A CSV/text file.

2-D Dictionary:-
A two-dimensional dictionary is a dictionary having items as (key:value) pairs where the value part is itself a data structure: another dictionary, an ndarray, a Series object, a list, or a tuple.

Note:- The value parts of all the keys should have a similar structure and equal lengths.
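The five kinds of 2-D dictionary described above can each be passed to pd.DataFrame(). The sketch below builds the same small table five times; the names and marks are made-up sample data.

```python
import numpy as np
import pandas as pd

# Values as lists.
from_lists = pd.DataFrame({"Name": ["Asha", "Ravi"], "Marks": [85, 92]})

# Values as tuples.
from_tuples = pd.DataFrame({"Name": ("Asha", "Ravi"), "Marks": (85, 92)})

# Values as ndarrays.
from_arrays = pd.DataFrame(
    {"Name": np.array(["Asha", "Ravi"]), "Marks": np.array([85, 92])}
)

# Values as Series.
from_series = pd.DataFrame(
    {"Name": pd.Series(["Asha", "Ravi"]), "Marks": pd.Series([85, 92])}
)

# Values as dictionaries: the inner keys become the row labels.
from_dicts = pd.DataFrame(
    {"Name": {0: "Asha", 1: "Ravi"}, "Marks": {0: 85, 1: 92}}
)

print(from_lists)
```

In every case the outer keys become the column names.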

Examples of 2-D dictionaries:-
  • 2-D dictionary having values as lists
  • 2-D dictionary having values as tuples
  • 2-D dictionary having values as ndarrays
  • 2-D dictionary having values as Series
  • 2-D dictionary having values as dictionaries

Other ways of creating a DataFrame:-
  • Creating a DataFrame from a 2-D array
  • Creating a DataFrame from another DataFrame object
  • Creating a DataFrame from a list of dictionaries
  • Creating a DataFrame from a Text/CSV file
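The remaining routes (2-D array, another DataFrame, list of dictionaries, CSV text) might look like the sketch below. The data is made up, and io.StringIO is used here only as a stand-in for a real CSV file path so that the example is self-contained.

```python
import io

import numpy as np
import pandas as pd

# From a 2-D ndarray; the column names are supplied explicitly.
from_array = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["A", "B"])

# From another DataFrame; copy=True requests an independent copy.
from_df = pd.DataFrame(from_array, copy=True)

# From a list of dictionaries: each dictionary is one row,
# and a missing key becomes NaN.
from_records = pd.DataFrame([{"A": 1, "B": 2}, {"A": 3}])

# From CSV text; with a real file, pass its path to read_csv instead.
csv_text = "A,B\n1,2\n3,4\n"
from_csv = pd.read_csv(io.StringIO(csv_text))

print(from_records)
print(from_csv)
```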

DataFrame Attributes:-

The information related to a DataFrame can be obtained through its attributes:

(i) df.index:- It returns the index (row labels) of the DataFrame.

(ii) df.columns:- It returns the column labels of the DataFrame.

(iii) df.axes:- It returns a list representing both the axes (axis = 0, i.e. index, and axis = 1, i.e. columns) of the DataFrame.

(iv) df.dtypes:- It returns the dtype of each column in the DataFrame.

(v) df.size:- It returns an int representing the number of elements in the object.

(vi) df.shape:- It returns a tuple representing the dimensionality of the DataFrame.

(vii) df.ndim:- It returns an int representing the number of axes (dimensions), which is 2 for a DataFrame.

(viii) df.empty:- It indicates whether the DataFrame is empty.

(ix) df.T:- It returns the transpose of the DataFrame, swapping its index and columns.

(x) df.values:- It returns the values of the DataFrame as a NumPy array.

(xi) len(df):- It returns the number of rows in the DataFrame.

(xii) Getting the count of non-NaN values in a DataFrame:-

  • df.count() or df.count(0):- It counts the non-NaN values in each column.
  • df.count(1):- It counts the non-NaN values in each row.
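These attributes can be explored on a small DataFrame with a few missing values. The subject names, marks, and row labels below are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Maths": [90.0, np.nan, 70.0], "Science": [80.0, 85.0, np.nan]},
    index=["r1", "r2", "r3"],
)

print(df.index)    # row labels
print(df.columns)  # column labels
print(df.axes)     # [row index, column index]
print(df.dtypes)   # dtype of each column
print(df.size)     # 6
print(df.shape)    # (3, 2)
print(df.ndim)     # 2
print(df.empty)    # False
print(df.T)        # rows and columns swapped
print(df.values)   # underlying NumPy array
print(len(df))     # 3

print(df.count())        # non-NaN values per column
print(df.count(axis=1))  # non-NaN values per row
```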

   Deleting columns:-

  • Using the del statement
        del df[<column name>]

  • This deletes one column in place. It does not return or display the DataFrame, so we have to print df explicitly to see the result.
  • Using drop()
         df.drop([<columnname>,<columnname>], axis = 1)

  • This returns a new DataFrame without the given columns. df itself is unchanged unless inplace=True is passed or the result is assigned back to df.
  • Multiple columns can be deleted in one call.
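A short sketch of both approaches, using made-up column names. Note that del changes df itself, while drop() hands back a new DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Asha", "Ravi"], "Eng": [70, 80], "Maths": [90, 95]})

# del removes a single column from df in place.
del df["Eng"]
print(list(df.columns))       # ['Name', 'Maths']

# drop() returns a new DataFrame; df keeps its columns.
trimmed = df.drop(["Maths"], axis=1)
print(list(trimmed.columns))  # ['Name']
print(list(df.columns))       # ['Name', 'Maths']
```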
Deleting rows:-

  • To delete rows 
         <DF>.drop(index or sequence of indexes)
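For example, with made-up row labels "a", "b", and "c", a single label or a list of labels can be passed. As with columns, drop() returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

df = pd.DataFrame({"Marks": [70, 80, 90]}, index=["a", "b", "c"])

one_less = df.drop("a")         # delete a single row by its label
two_less = df.drop(["a", "c"])  # delete several rows at once

print(list(one_less.index))  # ['b', 'c']
print(list(two_less.index))  # ['b']
print(list(df.index))        # ['a', 'b', 'c']
```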

Renaming Rows and Columns in DataFrame:-

To rename rows(index) or column names in DataFrame, we use 'rename()' method.

Ex- df.rename(index={"Eng":"English"},columns={"Name":"Sname"}, inplace=True)
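The call above can be tried on a small DataFrame that actually has an "Eng" row label and a "Name" column. The data here is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Asha", "Ravi"]}, index=["Eng", "Maths"])

# inplace=True modifies df directly instead of returning a new DataFrame.
df.rename(index={"Eng": "English"}, columns={"Name": "Sname"}, inplace=True)

print(list(df.index))    # ['English', 'Maths']
print(list(df.columns))  # ['Sname']
```

Labels that are not mentioned in the mapping, such as "Maths" here, are left as they are.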

Iteration in DataFrame:-

  • Sometimes we need to process all the data values of a DataFrame. In such cases, we need to iterate over the DataFrame.
  • There are two ways to iterate over a DataFrame:
     (i) Row-wise using iterrows():-


(ii) Column-wise using items() (called iteritems() before pandas 2.0, where it was removed):-
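A minimal sketch of both kinds of iteration, on made-up data. The column-wise loop uses items(), the method that replaced iteritems() when the latter was removed in pandas 2.0.

```python
import pandas as pd

df = pd.DataFrame({"Eng": [70, 80], "Maths": [90, 95]}, index=["Asha", "Ravi"])

# Row-wise: each step yields the row label and the row as a Series.
for label, row in df.iterrows():
    print(label, row["Eng"], row["Maths"])

# Column-wise: each step yields the column name and the column as a Series.
for name, column in df.items():
    print(name, column.tolist())
```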




Surya Pratap Dash

I am a student, and I am passionate about learning new technologies and creating new blogs.
