Available courses

Business Analytics is the practice of iterative, methodical exploration of an organization's data, with an emphasis on statistical analysis. Business Analytics is used by companies committed to data-driven decision-making. It is about using your data to derive information, insights, knowledge, and recommendations. Businesses use business analytics to improve effectiveness and efficiency of their solutions.

In this module, I will talk about how analytics has progressed from simple descriptive analytics to being predictive and prescriptive. I will also talk about multiple examples to understand these better, and discuss various industry use cases. I will also introduce multiple components of big data analysis including data mining, machine learning, web mining, natural language processing, social network analysis, and visualization in this module. Lastly, I will provide some tips for learners of data science to succeed in learning and applying data science successfully for their projects.

Python and R are the two most popular programming languages for data scientists as of now. Python is an interpreted high-level programming language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, notably using significant whitespace. Python is open source, has awesome community support, is easy to learn, good for quick scripting as well as coding for actual deployments, good for web coding too.

In this module, I will start with basics of the Python language. We will do both theory as well as hands-on exercises intermixed. I will use Jupyter notebooks while doing hands-on. I will also discuss in detail topics like control flow, input output, data structures, functions, regular expressions and object orientation in Python. Closer to data science, I will discuss about popular Python libraries like NumPy, Pandas, SciPy, Matplotlib, Scikit-Learn and NLTK.

While Python has been used by many programmers even before they were introduced to data science, R has its main focus on statistics, data analysis, and graphical models. R is meant mainly for data science. Just like Python, R has also has very good community support. Python is good for beginners, R is good for experienced data scientists. R provides the most comprehensive statistical analysis packages.

In this module, I will again talk about both theory as well as hands-on about various aspects of R. I will use the R Studio for hands-on. I will discuss basic programming aspects of R as well as visualization using R. Then, I will talk about how to use R for exploratory data analysis, for data wrangling, and for building models on labeled data. Overall, I will cover whatever you need to do good data science using R.

Probability and Statistics helps in understanding whether data is meaningful, including inference, testing, and other methods for analyzing patterns in data and using them to predict, understand, and improve results.

We live in an uncertain and complex world, yet we continually have to make decisions in the present with uncertain future outcomes. To study, or not to study? To invest, or not to invest? To marry, or not to marry? This is what is captured mathematically using the notion of probability. Statistics on the other hand, helps us analyze data sets, and correctly interpret results to make solid, evidence-based decisions.

In this module, I will discuss some very fundamental terms/concepts related to probability and statistics that often come across any literature related to Machine Learning and AI. Key topics include quantifying uncertainty with probability, descriptive statistics, point and interval estimation of means, central limit theorem, and the basics of hypothesis testing.

Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. Machine Learning is a first-class ticket to the most exciting careers in data science. As data sources proliferate along with the computing power to process them, automated predictions have become much more accurate and dependable. Machine learning brings together computer science and statistics to harness that predictive power. It’s a must-have skill for all aspiring data analysts and data scientists, or anyone else who wants to wrestle all that raw data into refined trends and predictions.

In this module, broadly I will talk about supervised as well as unsupervised learning. We will talk about multiple types of classifiers like Naïve Bayes, KNN, decision trees, SVMs, artificial neural networks, logistic regression, and ensemble learning. Further, we will also talk about linear regression analysis, sequence labeling using HMMs. As part of unsupervised learning, I will discuss clustering as well as dimensionality reduction. Finally, we will also discuss briefly about semi-supervised learning, mult-task learning, architecting ML solutions, and a few ML case studies.

Project 1: Learning various classifiers on Iris dataset

Project 2: MLP for hand-written digit recognition

Project 3: Logistic regression on the titanic dataset

Project 4: Use CoNLL 2002 data to build a NER system