Practical 1

Data Pre-processing tasks in Python using Scikit-learn.

In simple terms, pre-processing refers to the transformations applied to your data before feeding it to a learning algorithm. In Python, the scikit-learn library provides ready-made pre-processing utilities in its sklearn.preprocessing module.

Various data pre-processing techniques:

Encoding: One-hot encoding is a process that transforms categorical data into a numerical form that ML algorithms can use to make better predictions. Most ML algorithms accept only numerical input. Using LabelEncoder, each category in the data to be encoded is mapped to an integer; one-hot encoding goes a step further and represents each category as its own binary (0/1) column.
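
As a minimal sketch of both encoders, assuming a small made-up column of colour names (the array and variable names are illustrative only and not part of the practical's dataset):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical categorical column, used only for illustration
colors = np.array(["red", "green", "blue", "green"])

# LabelEncoder maps each category to an integer.
# Classes are sorted alphabetically: blue=0, green=1, red=2
label_encoder = LabelEncoder()
color_codes = label_encoder.fit_transform(colors)

# OneHotEncoder expects a 2-D input (one column per feature) and
# returns a sparse matrix by default, so convert it to a dense array
one_hot_encoder = OneHotEncoder()
color_one_hot = one_hot_encoder.fit_transform(colors.reshape(-1, 1)).toarray()

print(color_codes)    # one integer per row
print(color_one_hot)  # one 0/1 column per category (blue, green, red)
```

Integer codes imply an ordering (blue < green < red) that the categories do not really have, which is one reason one-hot columns are often preferred for input features.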

Standardization: Data standardization is the process of rescaling one or more attributes so that they have a mean of 0 and a standard deviation of 1.
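
Standardization can be sketched with StandardScaler as follows. The small array X is made-up data chosen for illustration; each column is rescaled independently.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: two numeric attributes on very different scales
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

# For each column: subtract the column mean, then divide by the
# column standard deviation
scaler = StandardScaler()
X_std = scaler.fit_transform(X)

print(X_std.mean(axis=0))  # approximately 0 for every column
print(X_std.std(axis=0))   # approximately 1 for every column
```

In practice the scaler is usually fitted on the training data only and then applied to the test data with scaler.transform, so that both are rescaled with the same mean and standard deviation.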

Normalization: The aim of normalization is to bring the values of the numeric columns in the dataset onto a common scale, without distorting the differences in the ranges of values.
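
One common way to do this is min-max scaling, sketched below with MinMaxScaler. This choice is an assumption, since the text above does not name a specific scaler, and the array X is made-up data. Note that scikit-learn's Normalizer class does something different: it rescales each row (sample) to unit norm rather than each column to a common range.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: two numeric attributes on very different scales
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

# For each column: (value - column min) / (column max - column min),
# which maps every column onto the range [0, 1]
scaler = MinMaxScaler()
X_norm = scaler.fit_transform(X)

print(X_norm)  # both columns now run from 0 to 1
```

Because the mapping is linear within each column, the relative spacing between values in a column is preserved.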