MLOps - Day 15

Today's learning :-
Data Visualization:-
  • I'm continuing to learn data visualization. As we know, Python has lots of libraries for data visualization (matplotlib, seaborn, folium, etc.). Today I'll learn one more of them, called 'seaborn'.
  • Seaborn is a Python library that helps us draw statistical graphs; in the background it uses matplotlib.
  • We do two types of operations on data: 1. Analysis - working on past data to understand what has already happened is known as data analysis. 2. Analytics - using data to predict what is likely to happen in the future is known as data analytics.
  • There are three types of analysis, depending on how many variables we look at together: Uni-variate (one variable), Bi-variate (two variables) and Multivariate (three or more variables). Graphs plotted for them are known as Uni-variate, Bi-variate and Multivariate distribution graphs.
  • Let's do some practicals of data visualization using the seaborn library -
Code :-
import seaborn as sns
sns.set()
tips = sns.load_dataset('tips')
tips.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   total_bill  244 non-null    float64 
 1   tip         244 non-null    float64 
 2   sex         244 non-null    category
 3   smoker      244 non-null    category
 4   day         244 non-null    category
 5   time        244 non-null    category
 6   size        244 non-null    int64   
dtypes: category(4), float64(2), int64(1)
memory usage: 7.3 KB
tips.head(5)
tips.columns
Index(['total_bill', 'tip', 'sex', 'smoker', 'day', 'time', 'size'], dtype='object')
bill = tips['total_bill']
bill.head()
0    16.99
1    10.34
2    21.01
3    23.68
4    24.59
Name: total_bill, dtype: float64
bill.max() 
50.81 
bill.min()
3.07 
sns.distplot(bill)
<matplotlib.axes._subplots.AxesSubplot at 0xe2cacc8>
#Univariate
sns.distplot(bill, kde=False, bins=50)
<matplotlib.axes._subplots.AxesSubplot at 0xe415548>
bill.head()
0    16.99
1    10.34
2    21.01
3    23.68
4    24.59
Name: total_bill, dtype: float64
# Bivariate
sns.jointplot(data=tips, x='total_bill', y='tip', kind='scatter')

<seaborn.axisgrid.JointGrid at 0xed40fc8>
For more details and options, check the seaborn documentation.

After this, I moved on to classification in machine learning
  • If the target is not continuous and we only have to decide the probability of something happening or not, we solve this kind of use case with a classification approach instead of regression.
  • There are two types of classification: binary and multi-class.
  • As mentioned, the model gives a probability between 0 and 1, which is then turned into a class (0 or 1).
  • In binary classification we have to set a cut-off point to decide whether something will happen or not.
  • To solve binary classification use cases we can use logistic regression, available in sklearn as LogisticRegression, which uses the sigmoid function 1/(1 + e^-x) in the background.
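To make the sigmoid and the cut-off point concrete, here is a small sketch in plain Python. The function names `sigmoid` and `classify` and the cut-off of 0.5 are my own choices for illustration; this is not sklearn's internal code, only the idea behind it.

```python
import math

def sigmoid(x: float) -> float:
    """Squash any real number into the range 0..1: 1 / (1 + e^-x)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x use the equivalent form e^x / (1 + e^x),
    # which avoids overflow in math.exp for very negative inputs.
    e = math.exp(x)
    return e / (1.0 + e)

def classify(score: float, cutoff: float = 0.5) -> int:
    """Return class 1 if the probability reaches the cut-off, else class 0."""
    return 1 if sigmoid(score) >= cutoff else 0

for score in (-2.0, 0.0, 2.0):
    print(score, round(sigmoid(score), 3), classify(score))
```

A score of 0 gives a probability of exactly 0.5, so it sits right on the default cut-off. Raising the cut-off makes the model more reluctant to predict class 1, and lowering it does the opposite.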
