Skip to main content

MLOps - Day 15

Today's learning :-
Data Visualization:-
  • I'll continue to learning data visualization, As we know there are lots of libraries we have in python for data visualization (matplotlib, seabon, folium etc.). Today I'll learn one more library called 'seaborn'.
  • Seaborn, a library of python which help us to achieve statistical graphs, but in background it use matplotlib.
  • We have do two types of operations on data, 1. Analysis - if we do data operation on past data it is known as data analysis. 2. Analytics - if we do data operation on future data it is known as data analytics.
  • There are three type of variable we have in machine learning, Uni-variate, Bi-variate and Multivariate. Graphs plotted for Uni-variate, Bi-variate and Multivariate are known as Uni-variate, Bi-variate and Multivariate distribution graphs.
  • Let's do some practicals of data visualize using seaborn library-
Code :-
import seaborn as sns
sns.set()
tips = sns.load_dataset('tips')
tips.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   total_bill  244 non-null    float64 
 1   tip         244 non-null    float64 
 2   sex         244 non-null    category
 3   smoker      244 non-null    category
 4   day         244 non-null    category
 5   time        244 non-null    category
 6   size        244 non-null    int64   
dtypes: category(4), float64(2), int64(1)
memory usage: 7.3 KB
tips.head(5)
tips.columns
Index(['total_bill', 'tip', 'sex', 'smoker', 'day', 'time', 'size'], dtype='object')
bill = tips['total_bill']
bill.head()
0    16.99
1    10.34
2    21.01
3    23.68
4    24.59
Name: total_bill, dtype: float64
bill.max() 
50.81 
bill.min()
3.07 
sns.distplot(bill)
<matplotlib.axes._subplots.AxesSubplot at 0xe2cacc8>
#Univariate
sns.distplot(bill, kde=False, bins=50)
<matplotlib.axes._subplots.AxesSubplot at 0xe415548>
bill.head()
0    16.99
1    10.34
2    21.01
3    23.68
4    24.59
Name: total_bill, dtype: float64
# Bivariate
sns.jointplot(data=tips, x='total_bill', y='tip', kind='scatter')

<seaborn.axisgrid.JointGrid at 0xed40fc8>
Like this you can check document of seaborn  for more details and options.

After this jumped on classification in machine learning
  • If the data is not continuous and only to be decided the probability (to be happen or not) then this kind of use-case we solve using classification approach instead of regression.
  • There are two types of classification, Binary and multi-classification.
  • As mentioned it gives probability(0 or 1).
  • In Binary classification we have to set cut-off point to decided if something will happen or not.
  • To solve binary classification use cases we can use sklearn, which is known as Logistic regression, and sklearn use sigmoid function(1/1+e^-x) in background. 

Comments

Popular posts from this blog

error: db5 error(11) from dbenv->open: Resource temporarily unavailable

If rpm command is not working in your system and it is giving an error message( error: db5 error(11) from dbenv->open: Resource temporarily unavailable ). What is the root cause of this issue? How to fix this issue?   just a single command- [root@localhost rpm]# rpm --rebuilddb Detailed error message- [root@localhost rpm]# rpm -q firefox ^Cerror: db5 error(11) from dbenv->open: Resource temporarily unavailable error: cannot open Packages index using db5 - Resource temporarily unavailable (11) error: cannot open Packages database in /var/lib/rpm ^Cerror: db5 error(11) from dbenv->open: Resource temporarily unavailable error: cannot open Packages database in /var/lib/rpm package firefox is not installed [root@localhost rpm]# RPM manage a database in which it store all information related to packages installed in our system. /var/lib/rpm, this is directory where this information is available. [root@localhost rpm]# cd /var/lib/rpm ...

Failed to get D-Bus connection: Operation not permitted

" Failed to get D-Bus connection: Operation not permitted " - systemctl command is not working in Docker container. If systemctl command is not working in your container and giving subjected error message then simple solution of this error is, create container with -- privileged option and also provide init file full path  /usr/sbin/init [root@server109 ~]# docker container run -dit --privileged --name systemctl_not_working_centos1 centos:7 /usr/sbin/init For detailed explanation and understanding I am writing more about it, please have look below. If we have a daemon based program(httpd, sshd, jenkins, docker etc.) running inside a container and we would like to start/stop or check status of daemon inside docker then it becomes difficult for us to perform such operations , because by default systemctl and service  commands don't work inside docker. Normally we run below commands to check services status in Linux systems. [root@server109 ~]# systemctl status ...

AWS cloud automation using Terraform

In this post I'll create multiple resources in AWS cloud using Terraform . Terraform is an infrastructure as code( IAC ) software which can do lots of things but it is superb in cloud automation. To use Terraform we have write code in a high-level configuration language known as Hashicorp Configuration Language , optionally we can write code in JSON as well. I'll create below service using Terraform- 1. Create the key-pair and security group which allow inbound traffic on port 80 and 22 2. Launch EC2 instance. 3. To create EC2 instance use same key and security group which created in step 1 4. Launch Volume(EBS) and mount this volume into /var/www/html directory 5. Upload index.php file and an image on GitHub repository 6. Clone GitHub repository into /var/www/html 7. Create S3 bucket, copy images from GitHub repo into it and set permission to public readable 8 Create a CloudFront use S3 bucket(which contains images) and use the CloudFront URL to update code in /var/w...