
MLOps - Day 15

Today's learning :-
Data Visualization:-
  • I'll continue learning data visualization. As we know, Python has lots of libraries for data visualization (matplotlib, seaborn, folium, etc.). Today I'll learn one of them, 'seaborn'.
  • Seaborn is a Python library that helps us draw statistical graphs; in the background it uses matplotlib.
  • We do two types of operations on data: 1. Analysis - working on past data to understand what has already happened is known as data analysis. 2. Analytics - using data to predict what may happen in the future is known as data analytics.
  • In machine learning there are three types of analysis, depending on how many variables are involved: univariate (one variable), bivariate (two variables) and multivariate (more than two variables). Graphs plotted for them are known as univariate, bivariate and multivariate distribution graphs.
  • Let's do some practicals of data visualization using the seaborn library:
Code :-
import seaborn as sns
sns.set()
tips = sns.load_dataset('tips')
tips.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   total_bill  244 non-null    float64 
 1   tip         244 non-null    float64 
 2   sex         244 non-null    category
 3   smoker      244 non-null    category
 4   day         244 non-null    category
 5   time        244 non-null    category
 6   size        244 non-null    int64   
dtypes: category(4), float64(2), int64(1)
memory usage: 7.3 KB
tips.head(5)
tips.columns
Index(['total_bill', 'tip', 'sex', 'smoker', 'day', 'time', 'size'], dtype='object')
bill = tips['total_bill']
bill.head()
0    16.99
1    10.34
2    21.01
3    23.68
4    24.59
Name: total_bill, dtype: float64
bill.max() 
50.81 
bill.min()
3.07 
sns.distplot(bill)  # deprecated since seaborn 0.11; newer versions offer sns.histplot / sns.displot
<matplotlib.axes._subplots.AxesSubplot at 0xe2cacc8>
# Univariate
sns.distplot(bill, kde=False, bins=50)  # histogram only (no KDE curve), split into 50 bins
<matplotlib.axes._subplots.AxesSubplot at 0xe415548>
bill.head()
0    16.99
1    10.34
2    21.01
3    23.68
4    24.59
Name: total_bill, dtype: float64
# Bivariate
sns.jointplot(data=tips, x='total_bill', y='tip', kind='scatter')

<seaborn.axisgrid.JointGrid at 0xed40fc8>
You can check the seaborn documentation for more details and options.
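
To make the bins option used above a bit more concrete, here is a minimal pure-Python sketch of what a histogram does: it splits the range between the minimum and maximum into equal-width intervals and counts how many values land in each one. The helper histogram_counts is a hypothetical name written for this illustration and is not part of seaborn; it assumes every value lies between lo and hi. The sample values are the five rows printed by bill.head(), and 5 bins are used instead of 50 only to keep the output short.

```python
# Hypothetical helper (not a seaborn function): count how many values fall
# into each of `bins` equal-width intervals between `lo` and `hi`.
# Assumes every value satisfies lo <= value <= hi.
def histogram_counts(values, bins, lo, hi):
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # A value equal to `hi` belongs to the last bin, not a new one.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

# First five total_bill values, as printed by bill.head() in the session.
sample = [16.99, 10.34, 21.01, 23.68, 24.59]

# lo and hi are the bill.min() and bill.max() values from the session.
counts = histogram_counts(sample, bins=5, lo=3.07, hi=50.81)
print(counts)
```

A univariate plot like the one above draws one bar per bin with these counts as bar heights; more bins give a finer but noisier picture.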

After this, I moved on to classification in machine learning:
  • If the value to predict is not continuous and we only have to decide whether something will happen or not (and how likely it is), we solve that kind of use case with a classification approach instead of regression.
  • There are two types of classification: binary and multi-class.
  • As mentioned, the model gives a probability between 0 and 1, which is then turned into a class label (0 or 1).
  • In binary classification we have to set a cut-off point (threshold) to decide whether something will happen or not.
  • To solve binary classification use cases we can use logistic regression, which is available in sklearn; it uses the sigmoid function, 1/(1 + e^-x), in the background.
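
The sigmoid and cut-off idea from the points above can be sketched in a few lines of plain Python. This is only an illustration and not how sklearn implements logistic regression internally; the function names sigmoid and classify and the default cut-off of 0.5 are assumptions made for this example.

```python
import math

def sigmoid(x):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def classify(x, cutoff=0.5):
    """Return class 1 if the sigmoid output reaches the cut-off, else class 0.

    The 0.5 default is an assumed, commonly used cut-off; it can be changed.
    """
    return 1 if sigmoid(x) >= cutoff else 0

# Print each input score, its sigmoid value, and the resulting class label.
for score in (-2.0, 0.0, 2.0):
    print(score, round(sigmoid(score), 3), classify(score))
```

Raising the cut-off makes the classifier more reluctant to predict class 1, and lowering it has the opposite effect, so the right value depends on the use case.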
