Skip to main content

Posts

Showing posts from April, 2020

What is GIT and GITHUB?, How to use GIT and GITHUB?

  SCM(Source code management) is a very important area for all the developers who are working in team. If multiple developer are writing code for an application then it is very important to manage that code properly for fast delivery. There are lots of tools available in market like Mercurial SCM, Apache Subversion, Fossil, Git & GitHub .   You can think like code can be shared using emails, shared drives or somehow using file share techniques as well, then why SCM is required? Yes, you are correct if we just want to share code then these medium you can use. These are easy mediums but query comes that how you will track the code?, how you will check who has written and what has written? for this we need a code tracking technique and this kind of code tracking is known as versioning(version control). So finally we can say that using SCM we can share the code between multiple developers and also track the code and maintain versioning of it.   There are two types of SCM we have - C

MLOps Day17

Next to MLOps Day16 post :-   We have completed data cleaning process in previous post  and today we will move ahead to create a binary classification model using logistic regression approach.   Now first of all we will have to find our y(target) and X(predictors). Since we are trying to find survived or not so Survived column will be our y  and and there can be multiple X so here feature selection comes into the picture. There are multiple techniques of feature selection are available and few out of them are listed in starting posts of this series. Since I watched " Titanic" movie and heard lots of thing about titanic ship hence here I'll move with domain expert approach. Using domain expert I I find out my features and these are  Pclass,  Sex,  Age,  SibSp,  Parch,  Embarked.  A passenger is survived or not it is not at all dependent on  PassengerId, Name,  Ticket,  Fare.  That's why these are not my features. y = dataset['Survived'] y.head() 0

MLOps Day16

Binary Classification :- If you want to predict something and output of it is to be happen or not(0/1) this kind of problem solved under Binary classification. For this we use an algorithms/models is Sigmoid . To solve binary classification problems we use sklearn , sklearn call logistic regression and logistic regression internally use Sigmoid function. Hypothesis - Creating a model is also known a hypothesis. Today I am going to analysis ' Titanic ' passenger data set, and try to create a model and try predict something so that what we can do in future to avoid such casualties. Any data which has category is categorical data, doesn't matter if it contains integer or string. import pandas as pd dataset = pd.read_csv('train.csv') dataset.head() dataset.info() <class 'pandas.core.frame.DataFrame'> RangeIndex: 891 entries, 0 to 890 Data columns (total 12 columns): # Column Non-Null Count Dtype --- ------ --------