Machine Learning, Data Science and Generative AI with Python
Unlock the Power of Machine Learning & AI: Master the Art of Turning Data into Insight
Discover the Future of Technology with Our Comprehensive Machine Learning & AI Course – Featuring Generative AI, Deep Learning, and Beyond!
In an era where Machine Learning (ML) and Artificial Intelligence (AI) are revolutionizing industries across the globe, understanding how giants like Google, Amazon, and Udemy leverage these technologies to extract meaningful insights from vast data sets is more critical than ever. Whether you’re aiming to join the ranks of top-tier AI specialists (with an average salary of $159,000 as reported by Glassdoor) or you’re driven by the fascinating challenges this field offers, our course is your gateway to an exciting new career trajectory.
Designed for individuals with programming or scripting backgrounds, this course goes beyond the basics, preparing you to stand out in the competitive tech industry. Our curriculum, enriched with over 145 lectures and 20+ hours of video content, is crafted to provide hands-on experience with Python, guiding you from the fundamentals of statistics to the cutting-edge advancements in generative AI.
Why Choose This Course?

Updated Content on Generative AI: Dive into the latest in AI with modules on transformers, GPT, ChatGPT, the OpenAI API, Advanced Retrieval-Augmented Generation (RAG), LLM agents, LangChain, and self-attention-based neural networks.

Real-World Application: Learn through Python code examples based on real-life scenarios, making the abstract concepts of ML and AI tangible and actionable.

Industry-Relevant Skills: Our curriculum is designed based on the analysis of job listings from top tech firms, ensuring you gain the skills most sought after by employers.

Diverse Topics Covered: From neural networks, TensorFlow, and Keras to sentiment analysis and image recognition, our course covers a wide range of ML models and techniques, ensuring a well-rounded education.

Accessible Learning: Complex concepts are explained in plain English, focusing on practical application rather than academic jargon, making the learning process straightforward and engaging.
Course Highlights:

Introduction to Python and basic statistics, setting a strong foundation for your journey in ML and AI.

Deep Learning techniques, including MLPs, CNNs, and RNNs, with practical exercises in TensorFlow and Keras.

Extensive modules on the mechanics of modern generative AI, including transformers and the OpenAI API, with hands-on projects like fine-tuning GPT, Advanced RAG, LangChain, and LLM agents.

A comprehensive overview of machine learning models beyond GenAI, including SVMs, reinforcement learning, decision trees, and more, ensuring you have a broad understanding of the field.

Practical data science applications, such as data visualization, regression analysis, clustering, and feature engineering, empowering you to tackle real-world data challenges.

A special section on Apache Spark, enabling you to apply these techniques to big data, analyzed on computing clusters.
No previous Python experience? No problem! We kick-start your journey with a Python crash course to ensure you’re well-equipped to tackle the modules that follow.
Transform Your Career Today
Join a community of learners who have successfully transitioned into the tech industry, leveraging the knowledge and skills acquired from our course to excel in corporate and research roles in AI and ML.
“I started doing your course… and it was pivotal in helping me transition into a role where I now solve corporate problems using AI. Your course demystified how to succeed in corporate AI research, making you the most impressive instructor in ML I’ve encountered.” – Kanad Basu, PhD
Are you ready to step into the future of technology and make a mark in the fields of machine learning and artificial intelligence? Enroll now and embark on a journey that transforms data into powerful insights, paving your way to a rewarding career in AI and ML.

1. Introduction
What to expect in this course, who it's for, and the general format we'll follow.

2. Udemy 101: Getting the Most From This Course

3. Important note

4. Installation: Getting Started

5. [Activity] WINDOWS: Installing and Using Anaconda & Course Materials

6. [Activity] MAC: Installing and Using Anaconda & Course Materials

7. [Activity] LINUX: Installing and Using Anaconda & Course Materials

8. Python Basics, Part 1 [Optional]
In a crash course on Python and what's different about it, we'll cover the importance of whitespace in Python scripts, and how to import Python modules.

9. [Activity] Python Basics, Part 2 [Optional]
In part 2 of our Python crash course, we'll cover Python data structures including lists, tuples, and dictionaries.

10. [Activity] Python Basics, Part 3 [Optional]
In this lesson, we'll see how functions work in Python.

11. [Activity] Python Basics, Part 4 [Optional]
We'll wrap up our Python crash course covering Boolean expressions and looping constructs.

12. Introducing the Pandas Library [Optional]
Pandas is a library we'll use throughout the course for loading, examining, and manipulating data. Let's see how it works with some examples, and you'll have an exercise at the end too.

13. Types of Data (Numerical, Categorical, Ordinal)
We cover the differences between continuous and discrete numerical data, categorical data, and ordinal data.

14. Mean, Median, Mode
A refresher on mean, median, and mode, and when it's appropriate to use each.

15. [Activity] Using mean, median, and mode in Python
We'll use mean, median, and mode in some real Python code, and set you loose to write some code of your own.
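
As a taste of what this looks like, here is a minimal sketch using only Python's standard `statistics` module. The data is made up for illustration, and the lecture itself may use different libraries and data.

```python
import statistics

# Hypothetical sample: pages viewed per visitor (made-up values, one outlier)
page_views = [1, 2, 2, 3, 4, 2, 50]

mean_value = statistics.mean(page_views)      # pulled upward by the outlier 50
median_value = statistics.median(page_views)  # middle value once the data is sorted
mode_value = statistics.mode(page_views)      # most frequently occurring value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

Notice how a single outlier drags the mean well above the median, which is one reason the median is often preferred for skewed data.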

16. [Activity] Variation and Standard Deviation
We'll cover how to compute the variance and standard deviation of a data distribution, and how to do it using some examples in Python.
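
A minimal sketch of the computation, assuming we treat the data as a complete population (dividing by n rather than n - 1). The values are invented for illustration.

```python
import math

# Hypothetical data set (made-up values)
values = [1, 4, 5, 4, 8]

mean = sum(values) / len(values)

# Population variance: the average squared distance from the mean
variance = sum((x - mean) ** 2 for x in values) / len(values)

# Standard deviation: the square root of the variance, in the data's own units
std_dev = math.sqrt(variance)

print("variance:", variance)
print("standard deviation:", std_dev)
```

For a sample drawn from a larger population, you would usually divide by `len(values) - 1` instead.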

17. Probability Density Function; Probability Mass Function
Introducing the concepts of probability density functions (PDFs) and probability mass functions (PMFs).

18. Common Data Distributions (Normal, Binomial, Poisson, etc)
We'll show examples of continuous, normal, exponential, binomial, and Poisson distributions using IPython.

19. [Activity] Percentiles and Moments
We'll look at some examples of percentiles and quartiles in data distributions, and then move on to the concept of the first four moments of data sets.
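
The first four moments (mean, variance, skewness, and kurtosis) can be sketched in plain Python as follows. This assumes population-style moments, and the helper names `central_moment` and `describe_moments` are hypothetical, not part of any library.

```python
def central_moment(data, order):
    """Average of (x - mean) ** order over the data (a population moment)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** order for x in data) / len(data)


def describe_moments(data):
    """Return (mean, variance, skewness, kurtosis) for a list of numbers."""
    mean = sum(data) / len(data)
    variance = central_moment(data, 2)
    skewness = central_moment(data, 3) / variance ** 1.5  # 0 for symmetric data
    kurtosis = central_moment(data, 4) / variance ** 2    # 3 for a normal distribution
    return mean, variance, skewness, kurtosis


symmetric_sample = [1, 2, 3, 4, 5]  # made-up, perfectly symmetric values
mean, variance, skewness, kurtosis = describe_moments(symmetric_sample)
print(mean, variance, skewness, kurtosis)
```

Some libraries report "excess" kurtosis (this value minus 3), so check the documentation before comparing numbers.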

20. [Activity] A Crash Course in matplotlib
An overview of different tricks in matplotlib for creating graphs of your data, using different graph types and styles.

21. [Activity] Advanced Visualization with Seaborn

22. [Activity] Covariance and Correlation
We cover the concepts of covariance and correlation, which are used to look for relationships between different sets of attributes, along with some examples in Python.
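
Here is a rough sketch of both measures written from scratch, assuming the sample (n - 1) form of covariance. The page speed and purchase numbers are fabricated, and the lecture may well use a numerical library instead.

```python
import math


def covariance(xs, ys):
    """Sample covariance: summed products of deviations from each mean, over n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled into the range -1 to 1."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Fabricated data: slower pages go with smaller purchases
page_speeds = [1.0, 2.0, 3.0, 4.0]
purchase_amounts = [8.0, 6.0, 4.0, 2.0]

print("covariance:", covariance(page_speeds, purchase_amounts))
print("correlation:", correlation(page_speeds, purchase_amounts))
```

Covariance depends on the units of the data, while correlation is normalized, which makes it easier to interpret.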

23. [Exercise] Conditional Probability
We cover the concepts and equations behind conditional probability, and use it to try to find a relationship between age and purchases in some fabricated data using Python.

24. Exercise Solution: Conditional Probability of Purchase by Age
Here we'll go over my solution to the exercise I challenged you with in the previous lecture: changing our fabricated data to have no real correlation between ages and purchases, and seeing if you can detect that using conditional probability.
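
To give a feel for the idea, the sketch below estimates P(purchase | age group) by counting. The tiny data set and the function name `purchase_probability_given_age` are hypothetical; this is not the exercise's actual solution code.

```python
# Fabricated visits: (age group, whether the visitor purchased)
visits = [
    (20, True), (20, False), (20, False), (20, False),
    (30, True), (30, True), (30, False), (30, False),
]


def purchase_probability_given_age(visits, age_group):
    """Estimate P(purchase | age group) as purchases in the group / size of the group."""
    outcomes = [purchased for age, purchased in visits if age == age_group]
    return sum(outcomes) / len(outcomes)


overall = sum(purchased for _, purchased in visits) / len(visits)
print("P(purchase):", overall)
print("P(purchase | 20s):", purchase_probability_given_age(visits, 20))
print("P(purchase | 30s):", purchase_probability_given_age(visits, 30))
```

If age and purchases were independent, the conditional probabilities would all sit close to the overall purchase probability.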

25. Bayes' Theorem
An overview of Bayes' Theorem, and an example of using it to uncover misleading statistics surrounding the accuracy of drug testing.
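
A small worked sketch of the drug testing idea. The prevalence and accuracy figures are assumptions chosen for illustration, not numbers taken from the lecture.

```python
# Assumed, illustrative numbers
p_user = 0.003               # prior: fraction of people who use the drug
p_positive_given_user = 0.99  # test sensitivity
p_positive_given_clean = 0.01  # false positive rate

# Total probability of a positive test, over users and non-users
p_positive = (p_positive_given_user * p_user
              + p_positive_given_clean * (1 - p_user))

# Bayes' Theorem: P(user | positive) = P(positive | user) * P(user) / P(positive)
p_user_given_positive = p_positive_given_user * p_user / p_positive

print("P(user | positive test):", p_user_given_positive)
```

Under these assumptions, a "99% accurate" test still means only about 23% of positive results belong to actual users, because users are so rare to begin with.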

26. [Activity] Linear Regression
We introduce the concept of linear regression and how it works, and use it to fit a line to some sample data using Python.
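
As a rough illustration, ordinary least squares for a single feature has a simple closed form. The helper `fit_line` and the sample points are hypothetical; the lecture may rely on a library routine instead.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample data lying exactly on y = 2x + 1
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
print("slope:", slope, "intercept:", intercept)
print("prediction for x = 5:", slope * 5 + intercept)
```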

27. [Activity] Polynomial Regression
We cover the concepts of polynomial regression, and use it to fit a more complex relationship between page speed and purchases in Python.

28. [Activity] Multiple Regression, and Predicting Car Prices
Multivariate models let us predict some value given more than one attribute. We cover the concept, then use it to build a model in Python to predict car prices based on their number of doors, mileage, and number of cylinders. We'll also get our first look at the statsmodels library in Python.

29. Multi-Level Models
We'll just cover the concept of multi-level modeling, as it is a very advanced topic. But you'll get the ideas and challenges behind it.

30. Supervised vs. Unsupervised Learning, and Train/Test
The concepts of supervised and unsupervised machine learning, and how to evaluate the ability of a machine learning model to predict new values using the train/test technique.

31. [Activity] Using Train/Test to Prevent Overfitting a Polynomial Regression
We'll apply train/test to a real example using Python.

32. Bayesian Methods: Concepts
We'll introduce the concept of Naive Bayes and how we might apply it to the problem of building a spam classifier.

33. [Activity] Implementing a Spam Classifier with Naive Bayes
We'll actually write a working spam classifier, using real email training data and a surprisingly small amount of code!

34. K-Means Clustering
K-Means is a way to identify things that are similar to each other. It's a case of unsupervised learning, which could result in clusters you never expected!

35. [Activity] Clustering people based on income and age
We'll apply K-Means clustering to find interesting groupings of people based on their age and income.

36. Measuring Entropy
Entropy is a measure of the disorder in a data set; we'll learn what that means, and how to compute it mathematically.
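
A minimal sketch of Shannon entropy over class labels, assuming base-2 logarithms (so the result is in bits). The labels are invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over each class proportion p."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in counts.values())


evenly_mixed = ["hired", "hired", "rejected", "rejected"]  # maximum disorder for 2 classes
all_same = ["hired", "hired", "hired", "hired"]            # no disorder at all

print("mixed:", entropy(evenly_mixed))
print("uniform:", entropy(all_same))
```

Decision trees use this kind of measure to pick the split that reduces disorder the most.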

37. [Activity] WINDOWS: Installing Graphviz

38. [Activity] MAC: Installing Graphviz

39. [Activity] LINUX: Installing Graphviz

40. Decision Trees: Concepts
Decision trees can automatically create a flow chart for making some decision, based on machine learning! Let's learn how they work.

41. [Activity] Decision Trees: Predicting Hiring Decisions
We'll create a decision tree and an entire "random forest" to predict hiring decisions for job candidates.

42. Ensemble Learning
Random forests are one example of ensemble learning; we'll cover other techniques for combining the results of many models to create a better result than any one could produce on its own.

43. [Activity] XGBoost
XGBoost is perhaps the most powerful machine learning algorithm today, and it's really easy to use. We'll cover how it works, how to tune it, and run an example on the Iris data set showing how powerful XGBoost is.

44. Support Vector Machines (SVM) Overview
Support Vector Machines are an advanced technique for classifying data that has multiple features. They treat those features as dimensions, and partition this higher-dimensional space using "support vectors."

45. [Activity] Using SVM to cluster people using scikit-learn
We'll use scikit-learn to easily classify people using a C-Support Vector Classifier.

46. User-Based Collaborative Filtering
One way to recommend items is to look for other people similar to you based on their behavior, and recommend stuff they liked that you haven't seen yet.

47. Item-Based Collaborative Filtering
The shortcomings of user-based collaborative filtering can be solved by flipping it on its head, and looking at relationships between items instead of relationships between people.

48. [Activity] Finding Movie Similarities using Cosine Similarity
We'll use the real-world MovieLens data set of movie ratings to take a first crack at finding movies that are similar to each other, which is the first step in item-based collaborative filtering.
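
To make the idea concrete, here is a hedged sketch of cosine similarity between two movies' rating vectors. The ratings are invented (they are not MovieLens data), and the lecture's own code may compute similarity differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical ratings of three movies by the same four users (0 = not rated)
movie_a = [5, 4, 0, 1]
movie_b = [4, 5, 0, 1]  # rated much like movie_a
movie_c = [0, 0, 5, 0]  # rated by a completely different user

print("a vs b:", cosine_similarity(movie_a, movie_b))
print("a vs c:", cosine_similarity(movie_a, movie_c))
```

Movies rated similarly by the same people score close to 1, while movies with no raters in common score 0.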

49. [Activity] Improving the Results of Movie Similarities
Our initial results for movies similar to Star Wars weren't very good. Let's figure out why, and fix it.

50. [Activity] Making Movie Recommendations with Item-Based Collaborative Filtering
We'll implement a complete item-based collaborative filtering system that uses real-world movie ratings data to recommend movies to any user.

51. [Exercise] Improve the recommender's results
As a student exercise, try some of my ideas, or some ideas of your own, to make the results of our item-based collaborative filter even better.

52. K-Nearest-Neighbors: Concepts
KNN is a very simple supervised machine learning technique; we'll quickly cover the concept here.

53. [Activity] Using KNN to predict a rating for a movie
We'll use the simple KNN technique and apply it to a more complicated problem: finding the most similar movies to a given movie just given its genre and rating information, and then using those "nearest neighbors" to predict the movie's rating.

54. Dimensionality Reduction; Principal Component Analysis (PCA)
Data that includes many features or many different vectors can be thought of as having many dimensions. Often it's useful to reduce those dimensions, whether to make the data easier to visualize, to compress it, or just to distill the most important information from a data set (that is, the information that contributes the most to the data's variance). Principal Component Analysis and Singular Value Decomposition do that.

55. [Activity] PCA Example with the Iris data set
We'll use scikit-learn's built-in PCA system to reduce the 4-dimensional Iris data set down to 2 dimensions, while still preserving most of its variance.

56. Data Warehousing Overview: ETL and ELT
Cloud-based data storage and analysis systems like Hadoop, Hive, Spark, and MapReduce are turning the field of data warehousing on its head. Instead of extracting, transforming, and then loading data into a data warehouse, the transformation step is now more efficiently done using a cluster after it's already been loaded. With computing and storage resources so cheap, this new approach now makes sense.

57. Reinforcement Learning
We'll describe the concept of reinforcement learning, including Markov Decision Processes, Q-Learning, and Dynamic Programming, all using a simple example of developing an intelligent Pac-Man.
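
The heart of Q-Learning is a single update rule, sketched below. The state and action names, the learning rate `alpha`, and the discount factor `gamma` are all hypothetical choices for illustration.

```python
def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one Q-Learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max Q(s', a') - Q(s, a))
    Unseen (state, action) pairs are treated as having a Q-value of 0.
    """
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    old_value = q_table.get((state, action), 0.0)
    new_value = old_value + alpha * (reward + gamma * best_next - old_value)
    q_table[(state, action)] = new_value
    return new_value


actions = ["left", "right"]
q_table = {}

# The agent moves right from "start", eats a pellet (reward 1.0), and lands in "hall"
first = q_update(q_table, "start", "right", 1.0, "hall", actions)
second = q_update(q_table, "start", "right", 1.0, "hall", actions)
print(first, second)
```

Each repeated experience nudges the stored value closer to the reward the agent can actually expect from that move.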

58. [Activity] Reinforcement Learning & Q-Learning with Gym

59. Understanding a Confusion Matrix
What's a confusion matrix, and how do I read it?
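
As a small illustration for a binary classifier, the four cells of a confusion matrix can be tallied by hand. The labels below are made up, and `confusion_counts` is a hypothetical helper rather than a library function.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary (0 or 1) labels."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, guess in zip(actual, predicted):
        if truth and guess:
            counts["tp"] += 1  # predicted positive, really positive
        elif guess:
            counts["fp"] += 1  # predicted positive, really negative
        elif truth:
            counts["fn"] += 1  # predicted negative, really positive
        else:
            counts["tn"] += 1  # predicted negative, really negative
    return counts


# Made-up labels: 1 = positive class, 0 = negative class
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
precision = counts["tp"] / (counts["tp"] + counts["fp"])
recall = counts["tp"] / (counts["tp"] + counts["fn"])
print(counts, precision, recall)
```

Precision and recall fall straight out of these four counts, which is why the confusion matrix is usually the first thing to look at.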

60. Measuring Classifiers (Precision, Recall, F1, ROC, AUC)

61. Bias/Variance Tradeoff
Bias and variance both contribute to overall error; understand these components of error and how they relate to each other.

62. [Activity] K-Fold Cross-Validation to avoid overfitting
We'll introduce the concept of K-Fold Cross-Validation to make train/test even more robust, and apply it to a real model.
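
The splitting step behind K-Fold Cross-Validation can be sketched in a few lines. This version assumes the data has already been shuffled, and `k_fold_splits` is a hypothetical helper; in practice a library routine usually handles this.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs, one per fold.

    Every sample lands in exactly one test fold; earlier folds absorb
    any remainder when n_samples is not divisible by k.
    """
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_indices = indices[start:start + size]
        train_indices = indices[:start] + indices[start + size:]
        yield train_indices, test_indices
        start += size


splits = list(k_fold_splits(10, 3))
for train_indices, test_indices in splits:
    print("train:", train_indices, "test:", test_indices)
```

You would train and score the model once per fold, then average the scores for a more stable estimate than a single train/test split.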

63. Data Cleaning and Normalization
Cleaning your raw input data is often the most important, and time-consuming, part of your job as a data scientist!

64. [Activity] Cleaning web log data
In this example, we'll try to find the top-viewed web pages on a web site, and see how much data pollution makes that into a very difficult task!

65. Normalizing numerical data
A brief reminder: some models require input data to be normalized, or scaled to the same range as each other. Always read the documentation on the techniques you are using.
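
One common approach is min-max scaling, sketched here in plain Python. It assumes the values are not all identical (otherwise the range is zero), and the ages and incomes are fabricated.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0.0 and the largest 1.0."""
    low, high = min(values), max(values)
    return [(value - low) / (high - low) for value in values]


# Fabricated features on very different scales
ages = [20, 30, 40]
incomes = [20000, 50000, 80000]

scaled_ages = min_max_scale(ages)
scaled_incomes = min_max_scale(incomes)
print(scaled_ages)
print(scaled_incomes)
```

After scaling, income no longer dominates age simply because its raw numbers are larger.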

66. [Activity] Detecting outliers
A review of how outliers can affect your results, and how to identify and deal with them in a principled manner.

67. Feature Engineering and the Curse of Dimensionality

68. Imputation Techniques for Missing Data

69. Handling Unbalanced Data: Oversampling, Undersampling, and SMOTE

70. Binning, Transforming, Encoding, Scaling, and Shuffling

71. Warning about Java 21+ and Spark 3!

72. Spark installation notes for MacOS and Linux users

73. [Activity] Installing Spark
We'll present an overview of the steps needed to install Apache Spark on your desktop in standalone mode, and get started by getting a Java Development Kit installed on your system.

74. Spark Introduction
A high-level overview of Apache Spark, what it is, and how it works.

75. Spark and the Resilient Distributed Dataset (RDD)
We'll go in more depth on the core of Spark, the RDD object, and what you can do with it.

76. Introducing MLLib
A quick overview of MLLib's capabilities, and the new data types it introduces to Spark.

77. Introduction to Decision Trees in Spark
We'll walk through an example of coding up and running a decision tree using Apache Spark's MLLib! In this exercise, we try to predict if a job candidate will be hired based on their work and educational history, using a decision tree that can be distributed across an entire cluster with Spark.

78. [Activity] K-Means Clustering in Spark
We'll take the same example of clustering people by age and income from our earlier K-Means lecture, but solve it in Spark!

79. TF / IDF
We'll introduce the concept of TF-IDF (Term Frequency / Inverse Document Frequency) and how it applies to search problems, in preparation for using it with MLLib.
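
Here is a minimal plain-Python sketch of one common TF-IDF variant (raw term frequency times the natural log of total documents over documents containing the term). Libraries often use smoothed or otherwise adjusted formulas, and the tiny corpus below is made up.

```python
import math


def term_frequency(term, document):
    """How often the term appears in the document, relative to its length."""
    return document.count(term) / len(document)


def inverse_document_frequency(term, corpus):
    """log(total documents / documents containing the term); 0.0 if none contain it."""
    containing = sum(1 for document in corpus if term in document)
    if containing == 0:
        return 0.0
    return math.log(len(corpus) / containing)


def tf_idf(term, document, corpus):
    """Relevance of a term to one document within a corpus."""
    return term_frequency(term, document) * inverse_document_frequency(term, corpus)


# Made-up corpus of tokenized documents
corpus = [
    ["spark", "is", "fast"],
    ["spark", "runs", "on", "clusters"],
    ["pandas", "is", "handy"],
]

for term in ["fast", "spark", "pandas"]:
    print(term, tf_idf(term, corpus[0], corpus))
```

A word that is rare across the corpus ("fast") scores higher for the first document than a word that shows up in several documents ("spark"), which is the behavior a search ranking wants.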

80. [Activity] Searching Wikipedia with Spark
Let's use TF-IDF, Spark, and MLLib to create a rudimentary search engine for real Wikipedia pages!

81. [Activity] Using the Spark DataFrame API for MLLib
Spark 2.0 introduced a new API for MLLib based on DataFrame objects; we'll look at an example of using this to create and use a linear regression model.

82. Deploying Models to Real-Time Systems
High-level thoughts on various ways to deploy your trained models to production systems including apps and websites.

83. A/B Testing Concepts
Running controlled experiments on your website usually involves a technique called the A/B test. We'll learn how it works.

84. T-Tests and P-Values
How to determine the significance of an A/B test's results, and measure the probability of the results being just from random chance, using T-Tests, the T-statistic, and the P-value.

85. [Activity] Hands-on With T-Tests
We'll fabricate A/B test data from several scenarios, and measure the T-statistic and P-value for each using Python.
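
As a rough sketch, the T-statistic for two independent groups can be computed by hand. This uses Welch's form (which does not assume equal variances) as an assumption, the order amounts are fabricated, and turning the statistic into a P-value requires the t-distribution, which a statistics library normally provides.

```python
import math
import statistics


def welch_t_statistic(group_a, group_b):
    """Difference in means divided by its standard error (Welch's form)."""
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 divisor)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Fabricated order amounts for control (A) and treatment (B)
control = [1, 2, 3, 4, 5]
treatment = [2, 3, 4, 5, 6]

t_statistic = welch_t_statistic(control, treatment)
print("T-statistic:", t_statistic)
```

A T-statistic near zero suggests little difference between the groups; larger magnitudes are less likely to be the product of chance.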

86. Determining How Long to Run an Experiment
Some A/B tests just don't affect customer behavior one way or another. How do you know how long to let an experiment run for before giving up?

87. A/B Test Gotchas
There are many limitations associated with running short-term A/B tests: novelty effects, seasonal effects, and more can lead you to the wrong decisions. We'll discuss the forces that may result in misleading A/B test results so you can watch out for them.