Practical Step-by-Step Course for Beginners. Working step by step with real data, we will go through the main processes related to the topic “Big data and machine learning”.
€379.00 €189.00
Description
🎓 This course is an introduction to #BigData and #MachineLearning with #Python programming for absolute beginners who have no background in programming.
Since the material turned out to be voluminous, I divided the course into five parts.
📑 The first part is devoted to collecting and extracting data from documents. ✔️ You will learn how to extract data from PDF documents, including drawings and any other documents in PDF format. We will work with two data sets consisting of PDF files, which we will convert to text and to tabular form. We will then visualize the resulting data on the Kaggle platform using Python libraries that let us present it graphically. ✔️ During the training, we will install Python and libraries such as pandas, seaborn, matplotlib and others. We will upload the resulting data to the Kaggle platform, visualize it there in a Jupyter Notebook, and finally upload our data to the GitHub platform.
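To give an idea of what converting extracted text to tabular form can look like, here is a minimal sketch using only the Python standard library. It assumes the text has already been extracted from a PDF; the sample text, the column names and the helper names `text_to_rows` and `rows_to_csv` are made up for illustration and are not taken from the course material.

```python
import csv
import io

# Hypothetical text, as it might look after extraction from one PDF page.
EXTRACTED_TEXT = """\
Project   Area_m2   Cost_EUR
Alpha     1200      950000
Beta      800       640000
Gamma     1500      1210000
"""


def text_to_rows(text):
    """Split whitespace-separated lines into a header and a list of data rows."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    return lines[0], lines[1:]


def rows_to_csv(header, rows):
    """Serialize the table as CSV text, ready to be saved or uploaded."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header, rows = text_to_rows(EXTRACTED_TEXT)
csv_text = rows_to_csv(header, rows)
print(csv_text)
```

Splitting on whitespace only works when no cell contains a space; real documents usually need more careful parsing.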
📑 The second part is devoted to collecting and extracting data from scanned documents and images. You will learn how to extract data from scanned documents and images: invoices, receipts, contracts and any other documents in PDF or image format. ✔️ We will work on real data. We will work with two data sets consisting of PDF files, which we will convert to text and to tabular form. We will then visualize the resulting data on the Kaggle platform using Python libraries that let us present it graphically. ✔️ During the training, we will install Python and libraries such as pandas, seaborn, matplotlib and others. We will upload the resulting data to the Kaggle platform, visualize it there in a Jupyter Notebook, and finally upload our data to the GitHub platform.
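Once a scanned invoice or receipt has been turned into text, individual fields can be picked out with regular expressions. The sketch below assumes the recognition step has already produced the text; the sample invoice, the field names and the patterns are hypothetical and would need tuning for real documents.

```python
import re

# Hypothetical recognized text for one scanned invoice.
OCR_TEXT = """\
INVOICE No: INV-2041
Date: 2021-03-15
Total: 1,250.00 EUR
"""

# Hypothetical patterns; real documents need patterns matched to their layout.
PATTERNS = {
    "invoice_no": r"INVOICE No:\s*(\S+)",
    "date": r"Date:\s*(\d{4}-\d{2}-\d{2})",
    "total": r"Total:\s*([\d,]+\.\d{2})",
}


def extract_fields(text):
    """Return a dict of field name -> matched string (None if not found)."""
    record = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        record[name] = match.group(1) if match else None
    return record


record = extract_fields(OCR_TEXT)
# Convert the total to a number so it can be summed or plotted later.
record["total"] = float(record["total"].replace(",", ""))
print(record)
```

One such record per document gives a list of rows that can be written to a table in the same way as any other tabular data.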
📑 In the third part we will consider the main options for storing big data. ✔️ In the practical lesson we will install the MySQL server on a computer and learn how to work with and edit MySQL databases. In the fifth lesson we will take an ordinary Excel table and transfer its contents to the MySQL server. ✔️ Then we will install Spark in order to work with datasets in a distributed manner. To process the distributed data, we will export the data from MySQL into Spark. Finally, using Jupyter Notebook, we will prepare the data for visualization.
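The idea of moving spreadsheet rows into a SQL table and querying them back can be sketched as follows. To keep the example runnable without a database server, it uses Python's built-in `sqlite3` module as a stand-in for MySQL; the table name, columns and rows are invented for illustration, and a real MySQL setup would use a MySQL driver and connection settings instead.

```python
import sqlite3

# Hypothetical rows, as they might be read from a spreadsheet.
ROWS = [
    ("Alpha", 1200, 950000.0),
    ("Beta", 800, 640000.0),
    ("Gamma", 1500, 1210000.0),
]

# In-memory SQLite database used as a stand-in for a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (name TEXT PRIMARY KEY, area_m2 INTEGER, cost_eur REAL)"
)
# Parameterized inserts keep the data separate from the SQL text.
conn.executemany("INSERT INTO projects VALUES (?, ?, ?)", ROWS)
conn.commit()

# Read a filtered subset back, e.g. before handing it to another tool.
large = conn.execute(
    "SELECT name, cost_eur FROM projects WHERE area_m2 > ? ORDER BY name",
    (1000,),
).fetchall()
print(large)
conn.close()
```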
📑 In the fourth part we will look at the main platforms for visualizing big data and consider the main online data visualization tools for big data. ✔️ We will briefly look at these platforms and generate several reports in each of them. This will give you the opportunity to choose the platform that suits you and your data. ✔️ In the practical lesson we will upload an Excel file with our data to the Kaggle platform and, using a Jupyter Notebook, clean the data and visualize it using different Python libraries.
📑 In the fifth part we will examine in detail the basic types, terms and algorithms of machine learning. We will go through the basic machine learning concepts that beginners need. We will consider in more detail K-means clustering, supervised learning methods such as Linear Regression, and other machine learning algorithms. ✔️ In the practical lessons we will predict the time and cost of construction for a new project X, based on the data that we collected on previous projects. In another lesson we will predict the cost and construction time of project X from the parameters that we set for the new project. ✔️ Then we will take open source data for the city of San Francisco. We will clean this raw data and display it in the form of charts and maps. We will collect various interesting insights from this public information. Then we will prepare the data to create a machine learning model and try to predict some parameters from this data.
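As a taste of what predicting the cost of a new project from previous projects means, here is a minimal sketch of simple linear regression (ordinary least squares with one input) in plain Python. The numbers are made up for illustration and are not course data; the function names `fit_line` and `predict` are likewise hypothetical.

```python
# Made-up data from "previous projects": floor area (m2) -> cost (thousand EUR).
AREAS = [800.0, 1000.0, 1200.0, 1500.0]
COSTS = [650.0, 800.0, 950.0, 1175.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict the target value for a new input x."""
    return slope * x + intercept


slope, intercept = fit_line(AREAS, COSTS)
# Estimate the cost of a hypothetical new project with 1300 m2 of floor area.
predicted = predict(1300.0, slope, intercept)
print(slope, intercept, predicted)
```

In practice a library implementation with several input parameters would normally be used; the hand-written version only shows what the fitting step computes.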