Part 1. Python. 1st Dataset. PDF files. Tika OCR. Regular Expression. Array and Function

Part 1. Pandas DataFrame. Kaggle. Jupyter Notebook.

Part 1. Independent Work Tasks. 2nd Dataset.

Part 1. GitHub. GitHub Desktop

Part 2. Python. PyTesseract OCR. Regular Expression. Array and Function.

Part 2. RegEx. Regular Expression in Python.

Part 2. Kaggle. Jupyter Notebook.

Part 3. Big Data Storage and MySQL.

Part 3. Practice. Export Excel worksheet data to a MySQL table

Part 3. A Storage System for Big Data. Hadoop.

Part 3. Practice. How Apache Spark makes your slow MySQL queries 10x faster.

Part 4. Data Visualization Tools. Introduction

Part 4. Practice. Data Visualization with Python. Kaggle and Jupyter Notebook.

Part 4. Online Data Visualization Tools. Introduction and getting started.

Part 5. Machine Learning. An Introduction.

Part 5. Practice. How does machine learning work?

Part 5. Workflow of a Machine Learning project.

Part 5. Practice. San Francisco – explore Building Permits Data. Build Predictive Model.

2nd Dataset. Task. Data from PDF. Getting data from PDF drawings.

🔎 Topics covered in this course:

  • Independent Work Tasks

  • Learn to Code – on real data (16 PDF files to chart)

  • A brief overview of the data in the task