Power BI Demos Part3 - Sep 4th 2024 | Bhaskar Jogi | Go Online Trainings | 90000 75637

Published: Sep 03, 2024 Duration: 01:24:17 Category: Education

Good morning, friends, and welcome to Go Online Trainings. Today is the final demo of data analytics using Power BI, demo part three of three. For people who missed the last two demos, I strongly recommend you watch them. If you don't have those demo classes, just send your details to basar gmail.com saying that you are a new student and you need those details. Mention your phone number, your email address and your location. Send me a separate mail, or you can also WhatsApp your details to these numbers.

Today is the final demo. From tomorrow this meeting ID is going to change, and tomorrow we are going to install the software on your computer. You need to have a computer or laptop; without that you will not be able to do the data analytics course. So what configuration do you need to buy? Let me show you the laptop configuration. I am using Lenovo laptops, but you can buy any laptop, that is fine. The laptop should have a minimum of 4 GB of RAM; I am talking about the minimum. This one gives you 8 GB of RAM and a 512 GB hard disk, and Windows 10 or 11 is needed. The laptop we are using costs 33,000, so don't go for very costly laptops; you will be able to get a laptop for 30,000 to 35,000 rupees.

We are going to give you daily recorded videos, and these recorded videos will be with you for 6 months. After that, if you want to extend for some time, that is also possible: you can pay a nominal amount, some 300 or 400 rupees per month or per quarter, and continue for another six months. Clear?

So this is the kind of laptop you can buy: minimum 4 GB of RAM, a 512 GB hard disk, and Windows 11 is mandatory. Don't go for very costly laptops, and don't buy these Celeron laptops. They are very cheap but completely useless; we cannot run our software on Celeron
laptops. Am I clear? So don't buy these at 21,000 or 22,000; you need to buy a laptop of minimum 30,000 to 35,000 and use that. That is the configuration I am talking about.

Now let us talk about today's final demo. Yesterday I gave you the complete instructions about what the IT industry is, what data analytics is and what data science is. Data analytics means understanding the past; data science, or artificial intelligence, means predicting the future. It is also called artificial intelligence, machine learning or deep learning; all of them are loosely used as synonyms. Clear?

Data analytics can tell you what has already happened. Suppose Amazon wants to know: what were my sales in India in the last three years? This is the best example of data analytics, or business intelligence: what has happened. What will happen is what AI gives you. AI does predictions about the future: what will my sales in India be in the next 3 years? What will the death rate in India be in the next 5 years, given that the obesity rate is increasing? Companies can predict how many people may become diabetic in the next year, how many will become cardiology patients, how many will have a cardiac arrest. Anybody can predict like that by taking the historical data of the company.

Yesterday I was also talking about the size of the data. When you have a small amount of data, you can keep it in Excel. When you have a large amount of data, I am talking about gigabytes of data, it can be stored in this kind of software in the IT industry, and if you want to process the data you use a language in which you need to write a lot of code. But if you have terabytes of data, we use different tools called data warehousing tools, and to process this much
data, BI tools were introduced. A language is completely code based, and people hate code, so the IT industry slowly introduced no-code or low-code tools from the year 2000 onwards. Clear? So BI tools were introduced around the year 2000: we have SAS BI, SAP BI, Microsoft BI, Tableau, and now Power BI. I covered all of this clearly in yesterday's class. You can watch that video on my Bhaskar Jogi YouTube channel as well; all the videos are already uploaded.

So guys, if you have terabytes of data we can go for data warehousing, but if you have petabytes, zettabytes or yottabytes of data, it is called big data. I am talking about trillions of records. That data you store in a separate computer system called Hadoop. Hadoop was implemented by the Apache Software Foundation, and Hadoop is free software. All of these others are paid software. MySQL is free software, but even Excel is paid software. Am I clear?

If you have terabytes of data, data warehousing can deal with it. But for big data, unlimited data: this keyword, big data, was introduced by Google in 2003, and if you want to store or process a large amount of data you can use Apache Hadoop, which was implemented in 2006 by the Apache Software Foundation. It is free software to store and process large amounts of data. So it was released in 2006.

Now, it has a lot of issues. If you want to work with large amounts of data, you again need to understand Java; without Java you will not be able to work with Apache Hadoop. Again, a lot of coding. To store the data, Hadoop uses a concept called HDFS, and to process the data it gives you one more concept called MapReduce. We will not go into all these things, because all of them
are Java based. If you have a large amount of data, Hadoop stores it using a concept called HDFS, the Hadoop Distributed File System. You should not be worried about it, because we are not going to study Hadoop in this course. To process the data, the Java language is used; with the help of Java, a concept called MapReduce is implemented, and with that people process the data. But Hadoop stores all the data on the hard disk. You don't need all these details; I am just telling you that it stores all the data on disk, and because of the disk, Hadoop is slow.

So there are two issues. The first issue is Java: Java means a lot of coding, and people hate coding. The second is that so much data is stored on the hard disk of the Hadoop computers, and because of that it is very, very slow. To overcome these issues, the same foundation, Apache Software, which is not a company but a foundation, a free charitable organization that gives you software for free, introduced Apache Spark in 2009. With Spark we do not have to use Java. It gives you one more language called Scala, which is a very easy language. People said: Scala is fine, but that is a new language and everybody is learning Python, so we are planning to use Python. They said yes, you can use Python and you can also use Scala, and both are easy. And if you have data, you must know SQL; without SQL you will not be able to work with it.

Spark stores all the data in RAM, and because of this Apache Spark is up to 100 times faster. Clear? That is one more thing you need to understand: it is up to 100 times faster because this software stores your data in RAM, whereas that software stores your data
on the hard disk. If I store something in RAM, I am very, very fast. How many times faster? Many times faster than these disk-based machines. Because of this, Apache Hadoop has gone, and Apache Spark has been in the market from 2009 onwards. Am I clear?

So what language are we going to use? You can use Scala, Python, SQL or the R language; you can use whichever language you want. Even Java or .NET you can use if you want to, but these are very complex. People may use Scala or Python, which are very easy, to process the data. And Spark stores all the data in RAM; that is the reason it is up to 100 times faster. Java is completely avoided, but if you want to use Java that is also fine. Clear? It supports a lot of languages and it is fast. When you have petabytes, zettabytes or yottabytes of data, that big amount of data can be stored today in Apache Spark. That is what you need to understand.

You can have either on-premises computer setups or the cloud; the cloud can also be used to store your Apache Spark data. Clear? What is cloud and what is on-premises, I will talk about later. So these are the things we have for storing data. I have explained all of this clearly, and if you have any questions you can ask me now. Clear?

If you have data, you may store it in Excel, in SQL, in a data warehouse, or in the end in your Hadoop systems. That is the point I am making here. If you have a small amount of data, Excel can be used. But if you have medium-size data, and by medium I mean GBs of data every month, this is medium
size data. Let us say GBs of data. Which one are you going to use? You can use SQL. So for a small amount of data Excel can be used, and here I am talking about KBs or MBs of data; this is what I gave you yesterday. If you have GBs of data, which is medium-size data, we use SQL. If you have large-size data, we use specialized computers called data warehouse computers. So today let us talk about all of them one by one.

If you have extra-large data, XL size, double XL, triple XL, any kind of XL, that is, unlimited data, how do you store it? Today we store it in Apache Spark; earlier we were using Apache Hadoop. Clear? These are the data storage technologies the IT industry has.

If you go and buy a laptop on Amazon, or do some transactions at ICICI Bank or State Bank of India, or use the PhonePe application, at the end of the day you are giving some data to PhonePe. All the PhonePe data will be stored in these systems. Excel is for small amounts of data, and PhonePe data will be very, very big, so the remaining options can handle it: SQL, data warehousing or Apache Spark. This is where the IT industry stores its data. These are the storage systems, that's it; you don't have anything else in the IT industry for storing your data.

After storing the data, what do we normally do? We build two kinds of applications: one is called BI, the other is called AI. One is artificial intelligence, the second is business intelligence. Clear? Artificial intelligence can take the data and it is
going to predict from it. What does AI mean? AI does predictions, remember that. It takes the data from Excel, or from SQL, or from a data warehouse, or from Apache Spark or Apache Hadoop; it takes a large amount of data and predicts. To predict, it uses a lot of algorithms. What are the algorithms? It may use a clustering algorithm or a classification algorithm, and it uses one more algorithm called regression. You may not understand all of them yet, but don't worry about it. Clear?

So AI uses your data to predict. Predicting means: what will my sales in India be in the next five years? What will the death rate in India be in the next five years, if the Government of India wants to predict something? We give past data, historical data, to AI, and it takes that historical data and predicts with the help of three algorithms: clustering, classification and regression. Clear? That is what is called AI.

But what is BI? BI, business intelligence, is also called data analytics. Data analytics means understanding the past, so it gives you a report on the past. Reporting on the past is called business intelligence or data analytics. It can tell you what the sales in India were in the last five years, or what the obesity rate in India was in the last 10 years. What will the obesity rate in India be in the next five years? That I would like to predict. So prediction is done with the help of AI, and understanding the past is called data analytics or business intelligence.

If you want to work with either AI or BI, you have to have the data. This data is collected, cleaned and stored by one more guy. The guy
is called a data engineer. Clear? If you want to work with AI or with BI, these guys will be helping you; they are called data engineers, remember. Data engineers get the data of the company, clean it, and store it in different platforms. Excel we don't use, because we cannot store so much data in it; it will be SQL, a data warehouse or Apache Spark, and 90% of the time you are storing the data in Apache Spark only. Clear? So the data engineer is the guy who works with all these technologies: Apache Spark, data warehousing, and most of the time SQL. If you want to become a data engineer you have to learn data warehousing, SQL, Apache Spark, big data, Hadoop and everything, plus you need to know the Python language.

A data engineer is also called the data preparation guy. If you want to prepare biryani or pizza for a thousand people, what do you need? You need to get the raw material. For biryani you need rice, and for a thousand people you have to go to the market at least one month before, sample the rice, and buy a lot of spices. Before buying the spices you need to choose them. Choose the rice, sample it, and if the rice is good, get it. After that you clean the rice. The spices you need to dry in the sun, clean, and then grind. Then, at least one or two days before, you have to get a lot of vegetables, and before that you need to choose which vendor to buy from. Go to the shop and buy a lot of
vegetables, and after that you need to clean them and cut them. You need to get the chicken too: sample it, and if the chicken is good, order it from the vendor, get it and cut it. You start this whole process not one day before but at least 10 to 15 days before, so that you can prepare a good biryani tomorrow. Clear? This is what is called food preparation. If you want to prepare biryani for a thousand people tomorrow, you cannot go to the shop tomorrow morning and get the rice bags, the vegetables and the spices within one hour. You cannot do that. Clear?

In the same way, if you want to work with artificial intelligence or business intelligence, you cannot get the data within one hour and perform the AI or data analytics work within one hour. What we need to do is get the data of the company, clean it, cut it, marinate it, join it. If there are issues in the rice you need to clean it; if the tomato is not good you need to cut it; the coriander you are going to use has to be cleaned and cut; and leafy vegetables have to be brought at least one day before, cleaned and soaked in salty water. All this is called food preparation.

So the data engineer is the guy who gets your data, cleans it and stores it. He has to do this work one or two months before, or one or two years before: daily getting the data, cleaning the data, storing the data. A lot of activities go on, so data engineering takes a lot of time. Clear? Once everything is ready, once the food preparation is done, tomorrow morning
who is going to come? The chef will come at 7:00 in the morning and switch on the stove, and within 3 hours your biryani is ready. But to make those three hours of cooking possible, the data engineering team will have been working for the whole year: collecting the data, cleaning the data, loading the data and removing a lot of duplicates from it. A lot of circus. That is what is called data preparation. Clear?

Data preparation is a very important topic as part of your AI or BI. Artificial intelligence people or business intelligence people cannot prepare predictions or reports on the past if you don't supply cleaned data. Clear? The chef cannot prepare biryani for a thousand people if you don't provide good rice, good vegetables and good chicken. Even if he is a very good chef with a very good reputation, if you don't supply clean raw material the food will turn out bad. Clear? Artificial intelligence or business intelligence can work properly only when you supply proper data to them. Who supplies it? The data engineering team. That is what you have to understand.

Now we are talking about artificial intelligence. What is the meaning of artificial intelligence? It means making the machine intelligent. Do you think a computer is intelligent? Do you think you are intelligent? No, right? When you join a course, do a lot of courses, or spend time in the right environment, you become intelligent. By default we are not intelligent, and like that a machine is also not intelligent. Clear? So what is artificial intelligence? Making the machine intelligent. The machine is not intelligent, but you do 50 things to the machine so that it becomes intelligent. You have a boy, and you do something so that he can become an intelligent guy: you coach him, you guide him, you
give him proper mentors; you do a lot of things for the boy or the girl, so that the girl can become an intelligent girl or the boy an intelligent boy. Clear? Machines, and by machines I mean computers, are not intelligent by default, but you need to provide something with the help of which they become intelligent. Those intelligent machines you can then use for a lot of activities. That is what machine learning or artificial intelligence means: making the machines intelligent.

How do you do that? There are thousands of things you can do, but here we use some algorithms: regression, clustering or classification. With the help of these algorithms you can make machines intelligent. Clear? You can do 10,000 things, but at the end of the day the machine learning or artificial intelligence team gives you these kinds of algorithms. With their help the machine takes the data and predicts, it can give you beautiful predictions, and the machines become intelligent machines. Use those machines for accomplishing your activities; that is what is called artificial intelligence today. Clear?

Now we have generative AI. What is generative AI? That is ChatGPT. Let me open ChatGPT here. ChatGPT is an AI application, or rather, it is not a plain AI application, it is a gen AI application. Normally AI applications predict, and AI applications have to get their data from somewhere; most of the data will be here only, remember this. So where does ChatGPT get its data from? It gets the data from Apache Spark or Apache Hadoop most of the time,
or it can also get the data from here, but most of the time AI applications read their data from Apache Spark or Apache Hadoop only.

So ChatGPT is a gen AI application. Normal AI predicts; this is what is called prediction. Generative AI does not predict; it generates. What does it generate? It may generate text, audio, video or an image. It is a subset of AI. ChatGPT is a gen AI application which, instead of predicting, generates: you ask something and it gives you the answer, in text, audio or video format, or you ask for an image and it gives you one. So it is a type of AI application. Gen AI means generative, and generative means it is generating, producing, giving you some information. Clear?

Initially AI used what are called machine learning algorithms internally. Machine learning algorithms can use only a small amount of data; when you have a large amount of data, machine learning algorithms will fail. To overcome this, we now use deep learning algorithms. When you have a large amount of data, we do not use machine learning; deep learning is going
to be used. For example, when you have the data in Apache Spark and you want to predict something, machine learning cannot be used there; we use a deep learning algorithm. Inside that, people have used large language models and natural language processing, a lot of things, and with the help of these models your ChatGPT application was built. Clear? So ChatGPT is an AI application that uses deep learning algorithms, because it takes a large data set, and it uses models like LLMs, large language models, and natural language processing techniques. It is not predicting; it is generating a lot of things.

So who supplies the data to ChatGPT or Copilot? This is Copilot; it is installed on my Windows laptop. Copilot is similar to ChatGPT. All of them get their data from Apache Spark or Apache Hadoop. And who stores that data? The data engineering team. Am I clear? This is what you have to understand.

So today we can do either data analytics or AI and ML. The data analytics course you can complete within three months, as I told you. The data engineering course you can also complete within 3 months. But if you want to learn data science or artificial intelligence, you need a minimum of 18 months. That is the story here: you need that much time to work with AI, ML and deep learning. Clear? So don't jump into AI applications.

AI means: make the machines intelligent and use those machines for your daily activities. Very simple. Clear? You can do 10,000 things to make the machines intelligent; one of them is the AI, ML and deep learning algorithms. At the end of the day you are predicting with the help of clustering, classification and regression. You
don't have anything else. These are the algorithms, or you can say the steps, that are applied to the data so that you can predict from it. That is the overview of AI, BI and data engineering. Am I clear?

So today, where is most data stored? In Apache Spark and Apache Hadoop. Hadoop is gone. Why is Hadoop gone? When you store a large amount of data there, it becomes very slow. That is the reason many companies store their data in Apache Spark. Where does ChatGPT get its data from? Apache Spark. These are big machines, completely managed by the bigger companies like Google, Microsoft, Facebook and Instagram; all the big companies maintain these Apache Spark machines. Am I clear? You need to understand that.

Now, today I want to tell you what data analytics is. We will not be working on AI; we are going to work only with business intelligence, data analytics. I will give you one use case. Amazon wants to analyze its data. What data do they have? Daily retail sales data. You are a data analyst and you join Amazon. Amazon will tell you: hey Bhaskar, you have to take my data and analyze it. You have learned a lot of things: SQL, Power BI, everything. Now you are ready to do data analytics for Amazon with the retail sales data, and you say: sir, give me your data, I am ready to create beautiful reports.

You are asking for the data, but they have a lot of data challenges. What are the issues with their data? They tell you: hey Bhaskar, you are asking for the data, but all the data is not available in a single location. Location is the first
challenge. I have some data in the US, some in India, some in the UK, some in Japan, some in Dubai. I don't have the data in a single place, because we are doing business all over the world. My data is decentralized. That is the first issue.

The second issue: hey Bhaskar, I don't have the data in a single environment. In the US we are using SAP, in India we are using Java, in the UK we run all the applications using Python, in Japan I store my data in Excel, and in Dubai we have a call center whose data is available in Excel. Clear? All the data belongs to a single company, but they don't have it in a single location and they don't have it in a single environment. Java applications are running, Android applications are running. You can go to the website and order products, or you can open the Amazon app on Android or iOS and order products. So they are capturing the data in a wide variety of environments and software. This is one more issue you need to understand. The data is not centralized, it is completely decentralized, and the environments are also different. It may even be Notepad data; Notepad files are called text files, or flat files.

Now what about the size of the data? They say: in some places I have a very large amount of data, in some places medium-size data, and in some places only a few GBs; in some places there may be terabytes, petabytes or zettabytes of data. So the data size is also not uniform.

What about the cleanliness, the accuracy of the data? When you ask them, in some places it is very clean data with high accuracy. In other places it is not very clean: at least some 20 to 30% of the data is unclean. And in some places the data is completely bad, not cleaned at all: 40% of the data will be
duplicated data, with a lot of blank values, unmatched data, misplaced data, missing values and spelling mistakes. So accuracy I cannot guarantee. Even though it is my data, I cannot guarantee it is clean data.

When you go to a shop, you can buy potatoes, tomatoes or mushrooms. Can you guarantee they are clean? It is not guaranteed. You have to bring the potatoes home and clean them, and if a potato is not good you need to peel it, or cut away half of it and throw that away. Even though you go to the shop and choose the potatoes yourself, those vegetables may still not be clean. When somebody supplies tons and tons of potatoes to a potato chips factory, do you think all the potatoes are clean? No. From some areas you will get very clean potatoes, but from other areas you will not; 40% of the potatoes will not be clean. Clear? So the chips factory has to get the potatoes, clean them, peel them, throw away the bad ones, and remove those that are too small. The factory does a lot of circus in order to make proper chips. Clear?

Like that, you cannot take the data as it is and dump it as it is; you need to clean it. Who is going to do all of that? You need to get the data from the different sources and work with the different sizes of data, like the different sizes of potato. You need to clean them, soak them in water, add salt water, turmeric and a lot of chemicals; somehow you need to clean them. A lot of processing has to be done so that you get clean chips tomorrow. Clear? With data, too, you cannot guarantee that it is clean, so we have to bring it in and clean it like that. So there are a lot of issues with the data. I have given you four
issues. You have to go and get the data and try to use the data. So what am I trying to do now? I'm trying to get the data, because all the data belongs to the Amazon company. I have to get the data from this place, and from this place, like that; they may have some hundreds of sources from which I can get the data. All this data I need to get: from here, from here, from here. In the US, people are using SAP computers. In India we don't know; say it is Java. When you go to amazon.com it may be Java, or it may be [unclear]. In Dubai they are using Excel sheets: whenever you call the call center saying "my product is not so good", the data may be stored in Excel.

So when you go to amazon.com, do you know how this site is built? It may be a Java application, a Python application, an AngularJS application, or an X application or a Y application. I don't know what technology they have used. But I go there because I want to buy a laptop. Whenever I choose the laptop, what happens? I click on this cart; I want to add to the cart or go to the cart. When I try to buy, it takes my card details or net banking details, and it stores all that data. So this is my computer, and I would like to buy a Lenovo laptop using my computer. How do I connect? I connect using an application. Which application, you do not know; it may be a Java application. When you buy a product, some data will go and stay there. Like that, people are buying and buying, so a lot of data will come and get stored. Clear? So the data is here; daily you are storing all the data. These computers are
called OLTP computers. Clear? This is where your data is stored: whenever you purchase any product, your data goes here. In India it is stored in the Java computer; it may be a Java application or maybe a .NET application, we don't know, but it is storing your data somewhere. Clear? Like that, SAP is also storing your data somewhere, and Excel is also storing the data. All the data is available. So what do we need to do? You have to extract all the data of the Amazon company. For that purpose we use ETL tools: extracting the data, transforming the data, loading the data. Clear? What does ETL stand for? Extract means read the data. Transform means clean or change the data. The last one is load: you need to load the data.

So we have a tool. I was already talking about it the other day when we talked about MSBI, do you remember? MSBI is the Microsoft BI tool from before Power BI. MSBI has three tools, that is what I told you: the first is SSIS, the second is SSAS, and the third is SSRS. Now I'm going to use this tool; this is the ETL tool. SSIS is SQL Server Integration Services, which is the ETL tool. Clear? Microsoft gave you MSBI in 2005. People were using three tools: one for ETL, the second for analysis, and the third, SSRS, to create reports. A lot of people stopped using SSRS; that is the reason Microsoft launched Power BI in 2015. Clear? People can use SSRS to create reports, but Power BI is self-service BI. What is the meaning? Any non-technical person can create a report. But to create a report you need to
have the data, and the data you can collect with the help of SSIS. All the data you get from the different locations you keep in a centralized server. This is a computer, a bigger computer, where you do the data cleansing work. What do you do? You remove duplicates if you have duplicate data; you correct misspelled data if you have spelling mistakes; or you may be converting the data, merging the data, or splitting the data. You do a lot of activities, right? All of these are called data cleansing activities. This computer is called the staging computer, or staging server. Staging means temporary. What software do we run here? You can use any SQL software here. Clear? So data is available in a lot of places: SAP, Java, Excel, .NET, SQL. You extract all of it, clean it, and store it in this machine. Clear? This machine is called the staging server: it is large hardware with SQL installed. You put the data there and you clean it completely.

I will give you a simple example. You want to buy a lot of vegetables. What do you do? You go to shop number one and buy some vegetables; you go to shop number two and buy some more; you go to shop number three and buy some more, plus chicken, milk, and everything. Right? After that, where will you keep all the items? You keep them in one temporary store, and that temporary store is your carry bag. When you go to the shop, what do you do? You're storing all the items; you're
going to be putting them into the carry bag. So this carry bag is your temporary storage. Okay? After coming home, what do you do? You already cleaned at the source level, but still, you come home, open the bag, and clean again: tomatoes you put in salty water, potatoes you peel, leafy vegetables you put in salty water. Clear? At the shop level you choose proper potatoes, so one type of filtering, one type of cleaning, is happening there. At the time of getting the data you clean, but that does not mean it is completely clean. Then comes staging: you keep all the vegetables in your bag, you come home, you open the bag, and you clean them a second time. The staging area is nothing but your carry bag. Clear? Staging area means temporary. The carry bag is temporary, right? Will you use it as permanent storage? No, no, no.

Once the vegetables are cleaned, where do they go? You take all of them from here, from this machine, again with the help of ETL. So again with the help of SSIS I take this data. These are my cleaned vegetables, and I keep all of them in the refrigerator. This is my bigger fridge; in the refrigerator I keep all the cleaned vegetables. This fridge is what is called your data warehouse. The fridge is much bigger than your bag, right? Data warehouse means large hardware plus SQL software installed; that is what I told you. So what do we do? Daily you go
to shop number one, shop number two, shop number three; you get the vegetables, put them in the carry bag, come home, open the carry bag, and clean them again. All the cleaned vegetables you pack and keep in the data warehouse. Your data warehouse will be a 200 L or 500 L fridge, depending on the vegetables you have. Even if it is a very big carry bag, you are not going to keep things in it permanently, right? The carry bag becomes empty when you take all the vegetables out. Tomorrow you take the same carry bag, get vegetables in it, clean them, and again keep the cleaned vegetables in the refrigerator. So that is what you need to understand: data warehouse means cleaned data, cleaned vegetables. Clear? That is what we store in the data warehouse machine. The data warehouse machine is called an OLAP machine.

I will take another 10 to 15 minutes to complete the demo. Today is the final demo, guys. From tomorrow this meeting ID is going to change. You will get the payment links and everything by 12:00 today. You can make the payment, and you can also pay in installments; at least make some payment and get the new meeting ID, and tomorrow morning you will be able to connect with the new meeting ID. Tomorrow we are going to install SQL, Power BI, and everything. Clear?

So the data warehouse is a large machine. Where do you get the data from? I will give you a simple example of a data warehouse in real time. Daily you buy a lot of products, and all your data may be going to Java, SAP, SQL, or Excel, we don't know. Right? Daily you are buying, but one day these machines will also be full of data. You may be buying very, very big machines
here, but these machines will also be full after one month, or six months, or one year, or three years. Daily people are buying and buying, so these servers will be full. These machines are OLTP machines, and these machines are OLAP machines. Let us understand what that means. Right?

I'm going to give you two more very important things. One is the OLTP environment: online transaction processing. The second environment in the IT industry is the OLAP environment: online analytical processing.

OLTP means it could be any server where you have fresh data. For example, you go to amazon.com and buy a product: that is fresh data. You go to State Bank of India and withdraw cash: a fresh transaction, right? You go to the PhonePe application and transfer funds to my account today, because you need to pay the amount: that is a fresh transaction. So PhonePe is an OLTP application, and your banking application is an OLTP application. The bank may be using Java software, Python software, Android software, or SQL software; we don't know how the data is stored. But whatever application you deal with is storing fresh data from you. Wherever there is fresh data, those applications are called OLTP. Transaction means what? Anything you do: you withdraw cash, deposit cash, make an inquiry, cancel a booking, book or return a product on amazon.com, book a hotel room, cancel a food order. Whatever you do in real time, every application captures the data. Clear? The data will be fresh data. Wherever there is fresh data, that
application is called an OLTP application. It may be an SAP application, Java application, SQL application, Android application, X application, or Y application; it doesn't matter. All users use OLTP applications only. PhonePe is OLTP. Why? I'm doing a fresh transaction. I go to bookmyshow.com and buy a lot of tickets for the latest KY movie, or the Thangalaan movie, for my family today: a fresh transaction, right? All the fresh data is stored in bookmyshow.com, maybe using a lot of software that we are not worried about, but BookMyShow is an OLTP application. OLTP applications are used by the end user. You are the end user; I am the end user. You go to Walmart or DMart and buy a product, and a bill is generated; that is a fresh bill, right? The data is stored in a DMart computer, maybe Java, so the DMart application is OLTP. Banking application: OLTP. Hotel room booking: OLTP. Airtel, where you recharge today: a fresh transaction, so that is also OLTP. Whatever you are using, whatever I am using, is completely OLTP. Clear?

OLTP stores the business data, and business data means fresh data. So what you need to understand: OLTP means fresh data and a small amount of data. Maybe I'm getting 500 GB of data per month in these machines. I don't have very big computers here; OLTP computers are small computers, and we store only small data here. Clear? That is the first point.

The second one: it is read and write data. 80% of the time people are buying products, right? Only 20% of the time are you reading, like "what is my balance in the bank?"; most of the time you are withdrawing cash or transferring cash. So in amazon.com only
you're reading the products or buying; if you don't buy, amazon.com cannot make profits. Right? So OLTP data is fresh data; it is read-write data, 80% write and a small amount of read; and it is used by the end users. Clear? And it is there to run your business: business data is captured daily, and these applications are used to run the business.

But now tell me: every day you are buying and buying, so one day these computers are full. Right? This computer may be full one day, and this one, and this one. So what do you do? You get all the data and put it here, and you clean it, because it all belongs to a single company, Amazon. Where will the cleaned data go? Into the data warehouse; it is a bigger machine, and all the data will go there. Then you come back here and delete all the data, because this is historical data, the last one year's data. I have pushed the data and cleaned it, so it is gone from here; this is empty now. Again data keeps coming; next month's and next year's data will come, and these computers will become full again. What do I do? I take all the data, keep it here, and clean it; then the cleaned data, last year's data as well, you put here. Once you put the data here, you delete it from there, and these buckets become empty. Clear? Like that, fresh data comes here; when these computers are full, some software takes the data from here, cleans it, and dumps it into the bigger data warehouse. Clear? So the data warehouse has old data, not daily data. It is not fresh
data: you accumulate one year of data and store it, so it is historical data. Every month you get 500 GB of data; over 12 months that becomes 6,000 GB, which is equivalent to 6 terabytes. And if I'm talking about 10 years, it becomes 6 into 10, that is 60 terabytes of data I want to store. Every month you are getting data, and all of it together becomes 60 terabytes; it becomes large data. Right? So it is historical data, and it is read-only data: nobody is going to modify old data. Who is going to use this data? Business users, and it is used to analyze your business. I will explain all of these last two points. Clear? So all the data you get, you keep here; all the historical data of the company will come here.

I will give you a simple example so that you can understand the entire picture. Suppose I don't have water available in my home. The government of India is supplying the water: at 6:00 in the morning water comes to my street, so I take a bucket and go. You can think of this as one bucket, and this as one more bucket. My mother, my father, all of us go and collect water from the street, because at 6:00 in the morning the water comes; there is a pipeline in the street. At 5:00 or 6:00 in the morning, me, my mother, my brother, my sister, everybody goes, because we don't have water. So what do you do? You take a bucket, open the tap, and the water comes. After five minutes this bucket will be full. My mother's bucket is also full, and my bucket is also full. If the bucket is full, what do you do? You have to maintain a drum or a tank. So this is the drum or the tank. You
come and pour the water here, and you go back with the empty bucket. With the empty bucket you again capture fresh water; wait five minutes and these buckets are full again. Again you dump this water here and come back with the empty bucket.

Now, what is your OLTP? OLTP means small in size. What is the bucket size? 30 or 40 L. Why are you using it? To collect the fresh data, the fresh water. When the bucket is full, what do you do? You maintain a tank or a drum at home; you dump the fresh water into the tank or drum, come back with the empty bucket, and again capture fresh data. Clear? But why can't you take the drum to the street? It will be very heavy; you cannot lift it. You cannot take the drum to the street. So we never use the data warehouse to capture your fresh data. Clear? Whenever customers are buying products, the OLTP computers are small in size, less than 1 terabyte, maybe something like your laptop's capacity. These are your buckets, you can think of it like that: a bucket is small in size and can hold small data, but fresh data. The data warehouse is a drum: it holds 1,000 L of water, and you can have the last days' water or one month's water in it. So the data warehouse stores large data, read-only data, historical data, and we keep it here.

Now one day my friend told me, "Bhaskar, why are you lifting water manually? Can't you buy a motor pump?" My friend said, "Hey Bhaskar, daily you take the bucket, slowly get the water, and dump it here. It is not very good. You better buy a motor pump." So I go and buy a motor pump. What are the motor pumps? These are the motor pumps; these are different companies' motor pumps. So what I'm trying
to do is use either this one or this one. This is what is called ETL. Clear? Let me copy them into one more Paint window. Where is this pump? Let me take these motor pumps, okay, and paste them here. Now you know what this is called: this is your ETL, remember this. ETL means extracting the data, transforming the data, loading the data. It will be connected to a tank. For example, if I take this pump, it will be connected to one tank or two tanks; it takes the water from here and puts it into my tank. So this is the tank, or you can say a drum.

You have different ETLs. This one is a 1.5 HP motor water pump, with max pressure and so on. This may be coming from one company, this from one more company, this is Crompton, this may be some other company. Like that, Microsoft's ETL is called SSIS. There is one more motor company called DataStage. This may be your Azure Data Factory, or this motor may be coming from a different brand. Like that we have DataStage, [unclear], Informatica; a lot of ETL tools are available. All of them are ETL tools, and ETL tools are nothing but your motor pump. Remember that.

So when I connect this, what will happen? I switch on the motor, and within a fraction of a second all the water can go here. So what is ETL? ETL is a data pumping tool. It is a motor pump. Clear? You switch on the motor, and it takes the water and dumps it into the destination tank. Now here I'm not using water, I'm using data. Clear? So SSIS takes the data and dumps the data to
the final drum. But my mother told me, "Hey Bhaskar, the water is not so clean. The water needs to be cleaned before dumping it into the final drum." So what I do is get all the water and initially keep it in one more drum. What is this drum called? Staging. Staging means temporary. There I clean the water, and after that I use one more motor pump to dump the cleaned water into the final drum. So staging is also one more drum or tank, but it will have muddy data, bad data, unclean water. The unclean water is cleaned here, and after that the cleaned water is stored in the data warehouse. Clear? So to move the water from here to here, I need one more motor pump. The water comes here, it is cleaned, and from here I use one more motor pump, and all the water goes. Like that, fresh data comes in, and every month or every three months you take the data and dump it.

OLAP means data warehouse. The data warehouse has a large amount of data, every day's data. Your OLTP machines are not going to have the old data. Go to your banking application and ask, "What was my balance in the year 2000?" Ask this question to your State Bank of India: "I opened my account in 1995 in State Bank of India; I want to know what my balance was on January 1st, 2000." They will say, "We don't have your old data here; the data has gone to the State Bank of India data warehouse." Why? Because State Bank of India is an OLTP application: it may store only the last six months, the last one year, or the last three years of data. Clear? So in State Bank of India, or the PhonePe application, or amazon.com, you don't see all the historical data. You may
be seeing the last 100 transactions, or the last one year's transactions, like that. The remaining data from PhonePe would have gone to the PhonePe data warehouse. Clear? Every month they pump the data from PhonePe OLTP to the PhonePe data warehouse. The data warehouse is called OLAP. Clear? Historical data is there in the data warehouse; only fresh data, this month's or this year's data, is available in your OLTP applications.

Now I have the data in the refrigerator; I have the vegetables. Can I get food? I am getting the vegetables and chicken and all of that, cleaning them, and storing them in the refrigerator. Can you get food from that? No, no, no. You have to use the kitchen and process the food. So this is my refrigerator, and I have to go to the kitchen to process. Here we use one more application, called SSAS. Clear? All the raw data is available here, but raw data alone cannot give you food, right? You have to go to the kitchen and prepare the food. The kitchen is your SSAS application; you process the data here. Process means what? You grind them, you cook them, you do a lot of circus; you are processing the food. Clear?

So here, let us say, I have 10 and 20, and I want to know the total: 30. That 30 is calculated in the cube. This is SSAS, where we create a cube; SSAS is where we process the food, so it is your kitchen. Clear? What is my total? 30. What is my minimum? 10. What is my maximum? 20. What is the average? 15. Like that, it takes millions of records and does the processing. I want to know my sales; you cannot go to the fridge, take out all the vegetables, and count them, right? You have the data in the data warehouse, but you cannot go and start counting it; you have billions of records in the data warehouse. So somebody
has to process the data. There is one more application called SSAS, and this application processes your data. Clear? What are the total sales? What are the total sales in India? In South India? In the last five years? For Apple iPhone products only, in the last five years? Like that, a lot of questions need to be answered, and this is how you use the SSAS application.

Now you have gone to the kitchen and the food is prepared. After that, what do you do? You serve the food, right? You need to take the food neatly from the kitchen, and all the processed food needs to be served. For processing we use SSAS; for serving I use a plate. Neatly you take a plate and serve the food, and here you garnish the food. This is where we use Power BI; Excel can also be used as a reporting tool, or we can use Tableau. Clear? All of these are your plates, where people can see the processed data in charts. Clear? Who sees these reports? Your decision makers: the company managers see the data on the plate, and the plate means Power BI. So Power BI gets the data from the kitchen.

So all the historical data is stored in the data warehouse; month by month you dump the data there. The data warehouse is a data storage technology. Clear? Who processes the data? The kitchen, and the kitchen means SSAS; this application processes your data. Then we may use Power BI or SSRS, either one, to create reports, and these reports are given to the decision makers to take decisions. Clear?
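
As a rough sketch of the flow described so far (get the data from the sources, clean it in staging, store it in the warehouse, then total it up in the kitchen), here is a small Python example. It is only an illustration: Python stands in for SSIS and SSAS, and the sample rows, the `order_id` field, and the function names `extract`, `transform`, `load`, and `summarize` are all assumptions made for this sketch, not part of the course material.

```python
# Hypothetical sample rows standing in for two OLTP sources (not real data).
source_us = [
    {"order_id": 1, "product": "laptop", "amount": 10},
    {"order_id": 1, "product": "laptop", "amount": 10},  # duplicated row
]
source_india = [
    {"order_id": 2, "product": " Laptop ", "amount": 20},  # untidy text
]


def extract(*sources):
    """Extract: read every row from every source into one staging list."""
    staging = []
    for rows in sources:
        staging.extend(rows)
    return staging


def transform(staging):
    """Transform: tidy the text and drop rows whose order_id was already seen."""
    seen_ids = set()
    cleaned = []
    for row in staging:
        if row["order_id"] in seen_ids:
            continue  # duplicate of a row we already kept
        seen_ids.add(row["order_id"])
        cleaned.append({
            "order_id": row["order_id"],
            "product": row["product"].strip().lower(),
            "amount": row["amount"],
        })
    return cleaned


def load(warehouse, cleaned):
    """Load: append the cleaned rows to the warehouse (the 'fridge')."""
    warehouse.extend(cleaned)
    return warehouse


def summarize(warehouse):
    """The 'kitchen' step: total, minimum, maximum, and average of amounts."""
    amounts = [row["amount"] for row in warehouse]
    return {
        "total": sum(amounts),
        "minimum": min(amounts),
        "maximum": max(amounts),
        "average": sum(amounts) / len(amounts),
    }


warehouse = load([], transform(extract(source_us, source_india)))
summary = summarize(warehouse)
print(summary)
```

With the two amounts 10 and 20 left after cleaning, the summary gives the same total, minimum, maximum, and average as the example in the talk. A reporting tool would then only display these numbers; it would not do the cleaning or the totalling itself.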
This application is called a business intelligence application, or a data analytics application: get the data, clean the data, store the data, process the data. These activities, from here to here, are called data engineering activities. Clear? We are not using ChatGPT kind of applications here. Data engineering means all these things, and then you create reports. Clear? So if you want to understand what data analytics is: you have to perform data engineering, and finally you create a report. Data engineering means you need to get the data from SQL, and you need to know data warehousing, ETL, SSAS, and all of that. For reporting we use Power BI, or you may use Excel, SSRS, or any other reporting tool. In companies, 70% of the time we work on data engineering; only 30% of the time we work on Power BI reports. Clear? So data analytics, or business intelligence, means you have to know data engineering plus report creation.

What is data science or AI, artificial intelligence? It is data engineering plus ML algorithms. Clear? Data engineering is common everywhere, and on top of that you apply a lot of models; here we use a lot of AI models and deep learning models. That is what artificial intelligence or machine learning means. Clear? So you need to know all these things if you want to become a data scientist or a data analyst; these are the skills. This will take a lot of time to understand: AI will take one and a half years to
understand. We are going to cover these skills and these skills. So if you don't have time, at least learn data engineering and work only with this. But here we are also going to use Spark, right? Here we may not use Spark most of the time; most of the time we get the data from the data warehouse, but here we use big data as well. Clear?

Who is going to use your data? Business users. Managers take the data and analyze how the company is growing, whether sales are spiking, decreasing, or increasing. They analyze month by month, year by year, quarter by quarter. You have historical data, so they can analyze any year's data, any minute's data, with the help of OLAP. OLAP means online analytical processing. They can analyze any type of data. Why? The complete data is there in the data warehouse, but your OLTP systems have only fresh data: the last one month's or the last one year's data. So you cannot perform data analytics there; there you only store the data. Data warehouse machines are called OLAP machines, where you can analyze the data, and the data will be available in terabytes. Clear?

You go to a temple and give a lot of gifts, right? There is a hundi, a gift box, and you put a lot of items into the hundi: gold, cash, a lot of items. Once the hundi is full, after one month, what do they do? They take all the small hundis, take all the gifts and cash out of them, store all of it in a bigger room, and count it. Right? The empty hundis are placed in the temple again, and again you go and put in fresh gifts and fresh cash. After one month the hundis are emptied again. Hundis are nothing but your OLTP storage: fresh gifts are going to be
stored there. Once the hundi is full, once in 15 days or once in 10 days, you take the items and store them in a bigger locker. The bigger locker is your data warehouse. Clear? Gold is stored separately, gift items are stored separately, cash is stored separately; you organize the items in the data warehouse separately. And the process continues. So one month of fresh gifts will be there in the OLTP, but the last 20 years of gifts and cash would have been collected and kept in the bigger lockers of your temples, churches, or mosques. Clear? That is how we need to understand fresh items and old items. This is what is called the complete BI architecture: BI is collecting the data and storing the data, and you work with SQL, data warehousing, and everything. Clear?

I will take another one minute and complete. What are we going to learn in your data analytics course? We are going to learn all these things, not one thing, not two things. This is what is called data analytics: you need to know SQL Server, you need to know Power BI, and we are going to learn SSIS, data warehousing, and SSAS as well. This month and next month I am going to concentrate on Power BI and SQL Server. For SQL Server we don't have live classes; I am going to give you recorded videos. Clear? 45 videos will be given, and you need to watch one video daily. From tomorrow, if you make the payment, all the SQL videos will be given. Without SQL, learning Power BI or SSIS or data warehousing or any technology is useless. Daily you need to watch one SQL video and come back to me with questions. Right? And from tomorrow we are going to have Power BI live classes. We'll complete this month and
next month only these two topics. This will be video training, and this will be live training. Next, in the third month, I'm going to be giving you SSIS, data warehousing, and SSAS. This can be completed within one month, so it is a total three-month course. All of them you need to know. Clear? So these are the things you're going to be learning as a data analyst. If you want to become a data engineer, you need to know SQL and a lot of other things: you need to learn Data Factory, Databricks, Python, and so on. Databricks is nothing but your big data Spark. Clear? And all of them we're going to be learning. So data engineering or data science means you're going to be learning a lot of other things. Clear? So, summary: data analytics means understanding the past, data science means predicting the future, and AI means making the machine intelligent using clustering, classification, and regression algorithms. Clear? We are not going to be going into AI. You can become a data analytics guy within three months, or a data engineering guy within three months, but data science will be a very, very difficult activity. So data analytics is no coding or low coding; data engineering is medium-level coding, and you need to know Python; AI will have more coding, and you need to have more stats and maths. Clear, guys? So this is the end of the demos.

Now, SQL Server, which we're going to be teaching, is a Microsoft tool. I will give you a brief introduction and some history about it. SQL Server is Microsoft's RDBMS software. Okay? The first version of this software was released in 1989. It is completely SQL; it is DBMS software, meaning it can store the data. Then you have 5.0, then 7.0, and then they released SQL Server 2000. Clear? Later they released one more version, 2005; you can see when it got released, in 2005. Clear? So this is the DBMS software, and in this version Microsoft added MSBI, so these three tools got added there itself. Okay? So what are these three tools? These
are the three that you need to know. What is the meaning of SSIS? SS is nothing but SQL Server, okay? SSIS means SQL Server Integration Services. This was the ETL tool Microsoft was using. Then we have SSAS. What is SSAS? It was the analysis tool. What does analysis tool mean? It is going to be processing your data; it is the kitchen, that is what I told you, right? The last tool was SSRS. It was the reporting tool, but it has completely stopped; people are not using it. Instead of this tool, what are we learning today? Power BI. So you have to learn these two tools, but you are not learning SSRS; we are going to be learning Power BI. Power BI is the advanced tool compared to SSRS. Clear? And Power BI is very, very easy, so you will be able to understand it easily. So we're going to be understanding these three things, and compulsorily you have to learn SQL Server. Right? So this is the version we had; since then a lot of versions got released: 2008, 2012, 2015, 2017, and 2019, and the current version is 2022. I will give you 2019 or 2022. Tomorrow we will install SQL Server and MSBI. Right? We're going to be working with SSIS and SSAS as well; SSRS we are not going to be using. So this will be a 25-day course, done in the third month, and SSAS will be five days. Clear? Power BI we're going to be learning for 45 days, so this month and next month I'm going to be targeting Power BI only. SQL Server is also approximately 45 videos; 45 hours of videos are going to be provided, and you will be able to start those first. All these things you need to learn, plus we're going to be learning one more concept, data warehousing. Clear? This will be for 10 hours. All of them are included. If you learn only Power BI, that is of no use. Clear? So we are going to be learning everything. This is what is called data analytics, and that is the reason we're going to be taking much more time to work with this. If you have any questions, you are welcome. So this is the final demo, and from tomorrow this
meeting ID is going to change. For the payment links and everything, wait till afternoon, 12:00 or 1:00; you will get the payment links and discounts. If you don't get the discounts, you can call them and ask them, or you can also ask me. If you have any questions, this is my direct number that I have given you. Wait for the payment links. I have given you one form to fill; if you fill this form, you are eligible for the daily recorded videos. The course validity is six months. Clear? This month we're going to be having class at morning 8:00. From next month onwards your batch is going to be starting at morning 7:00, Indian time, till the end of the course; remember this point. Only this month the batch will be from 8:00 to 9:00. So you need to plan; tomorrow you cannot complain. I'm telling all of you now. So daily recorded videos are going to be given, and the course validity is six months. Clear? Okay, guys, thank you so much. If you have any questions, you are welcome.
