
MLOps Day16

Binary Classification:

  • If you want to predict whether something will happen or not (an output of 0 or 1), the problem is a binary classification problem. The usual model for it is logistic regression, which is built on the sigmoid function.
  • To solve binary classification problems we use sklearn: sklearn provides logistic regression, and logistic regression internally uses the sigmoid function.
  • Hypothesis - A model is also known as a hypothesis. Today I am going to analyze the 'Titanic' passenger data set and try to create a model that predicts something from it, so that we can see what could be done in future to avoid such casualties.
  • Any data whose values fall into categories is categorical data; it doesn't matter whether the values are integers or strings.
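As a rough illustration of the sigmoid function mentioned above, here is a minimal sketch in plain Python. The names `sigmoid` and `predict_label` and the 0.5 threshold are assumptions made for this example; this is not sklearn's actual implementation.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real number to a value between 0 and 1."""
    # Use the form that never calls exp() on a large positive number,
    # so very negative inputs do not raise OverflowError.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_label(z: float, threshold: float = 0.5) -> int:
    """Turn the sigmoid output into a 0/1 class label (threshold is an assumption)."""
    return 1 if sigmoid(z) >= threshold else 0

scores = [-4.0, 0.0, 4.0]                  # hypothetical raw model scores
probabilities = [sigmoid(z) for z in scores]
labels = [predict_label(z) for z in scores]
print(probabilities, labels)
```

The score is squashed into a probability-like value, and the label is 1 when that value reaches the threshold, otherwise 0.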

import pandas as pd
dataset = pd.read_csv('train.csv')
dataset.head()
dataset.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          714 non-null    float64
 6   SibSp        891 non-null    int64  
 7   Parch        891 non-null    int64  
 8   Ticket       891 non-null    object 
 9   Fare         891 non-null    float64
 10  Cabin        204 non-null    object 
 11  Embarked     889 non-null    object 
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB
dataset.columns
Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')
import seaborn as sns
sns.set()
sns.countplot(x='Survived', data=dataset)
<matplotlib.axes._subplots.AxesSubplot at 0xca7c788>
sns.countplot(x='Survived', hue='Sex', data=dataset)
<matplotlib.axes._subplots.AxesSubplot at 0xe6bac88>
sns.countplot(x='Survived', hue='Pclass', data=dataset)
<matplotlib.axes._subplots.AxesSubplot at 0xe36aec8>
sns.heatmap(dataset.isnull(), cbar=False, yticklabels=False, cmap='viridis')
<matplotlib.axes._subplots.AxesSubplot at 0xfe72248>
age = dataset['Age']
sns.histplot(age.dropna(), kde=True)  # distplot is deprecated in newer seaborn releases
<matplotlib.axes._subplots.AxesSubplot at 0xffb9d88>
sns.countplot(x='SibSp', data=dataset, hue='Survived')
<matplotlib.axes._subplots.AxesSubplot at 0x100d7448>
  • If a column has null values and you want to use it as a feature because it carries a lot of weight, you have to do feature engineering on it. This type of feature engineering is known as imputation.
  • Imputation is the process of replacing missing values with substitute values.
  • We can read a typical age for each passenger class off a box plot (the line inside each box is the median, not the mean), like below.
sns.boxplot(data=dataset, y='Age',x='Pclass')
<matplotlib.axes._subplots.AxesSubplot at 0x10453f48>
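The same per-class figure can also be computed instead of being read off the plot. The sketch below uses a small hand-made frame (`sample`, hypothetical values, not rows from train.csv) and assumes the median is the statistic wanted.

```python
import pandas as pd

# Hypothetical rows standing in for the Titanic data; None marks a missing age.
sample = pd.DataFrame({
    "Pclass": [1, 1, 1, 2, 2, 3, 3, 3],
    "Age": [38.0, 40.0, None, 30.0, 28.0, 22.0, None, 26.0],
})

# groupby(...).median() ignores missing values by default.
median_age = sample.groupby("Pclass")["Age"].median()
print(median_age)
```

On the real data the same call would be made on `dataset` instead of `sample`.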
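def n_age(cols):
    # Fill a missing Age with a typical age for the passenger's class,
    # read off the box plot above.
    age = cols['Age']        # access by label; positional cols[0] is deprecated in newer pandas
    Pclass = cols['Pclass']
    if pd.isnull(age):
        if Pclass == 1:
            return 38
        elif Pclass == 2:
            return 30
        elif Pclass == 3:
            return 25
        else:
            return 30  # fallback for an unexpected class value
    else:
        return age
dataset['Age'] = dataset[['Age', 'Pclass']].apply(n_age,axis=1)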
dataset['Age']
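A row-wise `apply` works, but the same imputation can be written without a Python-level loop. The following is a sketch of one alternative, not the method used in this post; `sample` and `fallback_age` are hypothetical names, and the fill values are the ones from `n_age`.

```python
import pandas as pd

# Hypothetical rows; None marks a missing age.
sample = pd.DataFrame({
    "Pclass": [1, 2, 3, 3],
    "Age": [None, 41.0, None, 19.0],
})

# Fill values per class, taken from the n_age function.
fallback_age = {1: 38, 2: 30, 3: 25}

# Map each row's class to its fill value (30 for any unexpected class),
# then use it only where Age is missing.
fill_values = sample["Pclass"].map(fallback_age).fillna(30)
sample["Age"] = sample["Age"].fillna(fill_values)
print(sample)
```

Existing ages are left untouched; only the missing entries receive the class-based value.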
sns.heatmap(dataset.isnull(), cbar=False, yticklabels=False, cmap='viridis')
<matplotlib.axes._subplots.AxesSubplot at 0x10a5d448>
dataset.drop('Cabin', axis=1, inplace=True)
sns.heatmap(dataset.isnull(), cbar=False, yticklabels=False, cmap='viridis')
<matplotlib.axes._subplots.AxesSubplot at 0x10446d88>
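The heatmap is a visual check; a numeric count of missing values per column is a useful complement, because one or two nulls are easy to miss in a plot. A minimal sketch on a hypothetical frame (`sample` is made up for illustration):

```python
import pandas as pd

# Hypothetical rows: Age is complete, Embarked has one missing value.
sample = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0],
    "Embarked": ["S", None, "C"],
})

# isnull() marks missing cells; sum() counts them per column.
missing_per_column = sample.isnull().sum()
total_missing = int(missing_per_column.sum())
print(missing_per_column)
print("total missing:", total_missing)
```

Running `dataset.isnull().sum()` on the real data would show the remaining nulls per column in the same way.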
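  • We have imputed the missing Age values and dropped the mostly empty Cabin column. This process is known as data cleaning. Note that, per the info() output above, Embarked still has 2 missing values (889 non-null out of 891), so not every null is gone yet.
  • Please check the next post for the practical part on model creation.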
