Practical 2

Perform the following data pre-processing tasks using Python: univariate feature selection, recursive feature elimination, PCA, and correlation.

Feature selection: 

Feature selection enables a machine learning algorithm to train faster, reduces the complexity of the model, and makes the model easier to interpret.

Different Methods of Feature Selection:

Univariate feature selection: Univariate feature selection works by choosing the best features using univariate statistical tests such as chi-square. It tests each feature independently to assess the strength of that feature's relationship with the response variable. One such method is SelectKBest, which keeps only the required number of highest-scoring features and removes the rest.

Principal Component Analysis: PCA is a dimensionality-reduction method that is often used to reduce the dimensionality of large data sets by transforming a large set of variables into a smaller one that still contains most of the information in the original set.

Correlation: Correlation is a statistical term which, in common usage, refers to how close two variables are to having a linear relationship with each other.
Features with high correlation are more linearly dependent and hence have almost the same effect on the target variable. So, when two features are highly correlated, we can drop one of the two.

1 Import Libraries

Fig 1: Import libraries
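The import cell from Fig 1 is not reproduced here, so the sketch below assumes a typical set of libraries for the four tasks: pandas for loading the data and scikit-learn for encoding, feature selection, PCA, and the classifier used to compare accuracy.

```python
# Assumed imports covering all four tasks in this practical;
# the original Fig 1 cell may differ.
import numpy as np
import pandas as pd

from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
```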


Fig 2: Encode the data and find the accuracy before feature selection
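The code behind Fig 2 is not shown either. The following is a minimal sketch under assumed names: the dataset is read from a placeholder file `data.csv` with a label column called `target`, categorical columns are label-encoded, and a logistic regression model gives a baseline accuracy before any feature selection.

```python
# Hypothetical dataset path and target column name; replace with the
# actual dataset used in this practical.
df = pd.read_csv("data.csv")

# Encode every non-numeric column as integer labels so the models can use it.
for col in df.select_dtypes(include="object").columns:
    df[col] = LabelEncoder().fit_transform(df[col])

X = df.drop(columns=["target"])
y = df["target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Baseline model trained on all features.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("Accuracy before feature selection:",
      accuracy_score(y_test, model.predict(X_test)))
```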

2 Univariate Feature Selection


Fig 3: Best features scored for the given dataset
Using these selected features, the model achieves higher accuracy.

Fig 5: Accuracy after feature selection
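Since neither cell is reproduced, here is a sketch of the SelectKBest step, continuing from the variables defined in the encoding sketch above. The chi-square test requires non-negative features (label-encoded integers satisfy this), and k = 10 is an assumed value, not necessarily the one used in the original figures.

```python
# Keep the k highest-scoring features according to the chi-square test.
selector = SelectKBest(score_func=chi2, k=10)   # k must not exceed the number of columns
X_train_best = selector.fit_transform(X_train, y_train)
X_test_best = selector.transform(X_test)

print("Selected features:", X_train.columns[selector.get_support()].tolist())

# Re-train on the selected features only and compare with the baseline accuracy.
model = LogisticRegression(max_iter=1000)
model.fit(X_train_best, y_train)
print("Accuracy after feature selection:",
      accuracy_score(y_test, model.predict(X_test_best)))
```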

3 PCA

Fig 6: PCA for the given dataset
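A sketch of the PCA step, again reusing X from the earlier sketch; n_components = 2 is an assumption rather than the value used in Fig 6.

```python
# Standardise first, since PCA is sensitive to the scale of each feature.
X_scaled = StandardScaler().fit_transform(X)

# Project onto the first two principal components (assumed number).
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Reduced shape:", X_pca.shape)
```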

4 Correlation 


Fig 7: Correlation between features
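A sketch of the correlation step, using the same X: compute the absolute correlation matrix and flag one feature out of every pair whose correlation exceeds a threshold (0.9 here is an assumed cut-off).

```python
# Absolute pairwise correlation between features.
corr = X.corr().abs()

threshold = 0.9          # assumed cut-off for "highly correlated"
to_drop = set()
for i, col_i in enumerate(corr.columns):
    for col_j in corr.columns[i + 1:]:
        if corr.loc[col_i, col_j] > threshold:
            to_drop.add(col_j)   # keep col_i, drop its highly correlated partner

print("Features to drop due to high correlation:", sorted(to_drop))
X_reduced = X.drop(columns=list(to_drop))
```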



Get the code from here
