UNIT-1 INTRODUCTION TO AI
Artificial: Something man-made
Intelligence: Intelligence can be a quality of anyone – humans, animals, birds and even machines / The ability to interact with the real world / The ability to acquire information, retain it as knowledge, and apply knowledge and skills in various domains such as making decisions, solving problems, creating new things, and choosing the correct tool, path or people in a specific situation
Types of Intelligence: (9 types) / SMILE NK (Spatial-Visual, Mathematical, Musical, Intrapersonal, Interpersonal, Linguistic, Existential, Naturalist, Kinaesthetic)
Interpersonal: the ability to understand and communicate effectively with others / Intrapersonal: the ability to understand oneself and one's own thoughts and feelings / Existential: relating to religious and spiritual awareness / Linguistic: language-related skill, whether written or spoken
Decision making: the process of picking a final choice from a set of available choices after assessment / We can’t make “good” decisions without information / Information may come from past experience, intuition, knowledge and self-awareness
AI: When a machine possesses the ability to mimic human traits, i.e. make decisions, predict the future, and learn and improve on its own, it is said to have artificial intelligence / You can say that a machine is artificially intelligent when it can accomplish tasks by itself: collect data, understand it, analyse it, learn from it, and improve.
Any machine that has been trained with data and can make decisions/predictions on its own can be termed as AI. Here, the term ‘training’ is important.
What AI can do: an AI-based system can discover patterns in the available information / Can make decisions / Can converse in natural language / Can recognise and read text from images
What AI is NOT: AI is not just automation / AI has no emotions / AI is not magic, it is mathematics and algorithms / AI is not a single entity like a human or an animal; it is composed of multiple programs and large amounts of data and information
AI means the use of intelligence, not just automation like an automatic washing machine, smart TV or smart AC
Languages used for AI: Java, Python, Perl, LISP, Prolog
Applications of Artificial Intelligence around us:
Computer Vision: self-driving cars, face lock in smartphones, Google search by image, medical imaging
Data Science: Google Search engine, Google Maps, targeted advertising, fraud and risk detection, weather prediction, price comparison websites, email filters
NLP-based applications: speech recognition (speech to text), sentiment analysis, digital phone calls, chatbots, virtual assistants such as Google Assistant and Alexa
Machine Learning (ML): a subset of AI which enables machines to improve at tasks with experience (data). It enables machines to learn by themselves using the provided data and make accurate predictions/decisions.
Deep Learning (DL): It enables software to train itself to perform tasks with vast amounts of data. Deep Learning is the most advanced form of Artificial Intelligence out of these three. Deep learning is a subset of machine learning that uses artificial neural networks to mimic the learning process of the human brain.
Big Data: huge amounts of data, regularly growing at an exponential rate, e.g. social media data (posts, pictures, responses, users, etc.). Big data cannot be handled without AI.
AI Domains (3): Data Science / Computer Vision / Natural Language Processing (NLP)
Common Misconceptions about AI: AI will take your job/AI does not require humans/AI is harmful for people
AI Ethics: is a set of moral principles which help us discern between right and wrong. AI ethics is a multidisciplinary field that studies how to optimize AI's beneficial impact while reducing risks and adverse outcomes. Examples of AI ethics issues include data responsibility and privacy, fairness, transparency, environmental sustainability, moral agency, value alignment, accountability, trust, and technology misuse. This means taking a safe, secure, humane, and environmentally friendly approach to AI. A strong AI code of ethics can include avoiding bias, ensuring privacy of users and their data, and mitigating environmental risks.
UNIT-7 EVALUATION
Evaluation is the process of understanding the reliability of an AI model by feeding a test dataset into the model and comparing the model's outputs with the actual answers.
· A confusion matrix is a matrix that summarizes the performance of a machine learning model on a set of test data. It is often used to measure the performance of classification models. The matrix displays the number of true positives (TP), True negatives (TN), false positives (FP), and false negatives (FN) produced by the model on the test data. Prediction and Reality can be easily mapped together with the help of this confusion matrix.
· TP: Prediction and reality are both positive
· TN: Prediction and reality are both negative
· FP: Prediction is positive but reality is negative
· FN: Prediction is negative but reality is positive
TP AND TN ARE CORRECT RESULTS OR DECISIONS OF THE AI MODEL
FP AND FN ARE ERRORS OR INCORRECT RESULTS OF THE AI MODEL
· Prediction: the output given by the model (Positive/Negative)
· Reality (actual result): the real situation (Positive/Negative); in TP/TN/FP/FN, “True/False” tells whether the prediction matches reality, and “Positive/Negative” is what the model predicted
Accuracy: the percentage of correct predictions out of all the observations.
Accuracy = (Correct Predictions / Total Cases) * 100 = ((TP + TN) / (TP + TN + FP + FN)) * 100
Precision: (true positives out of all predicted positives)
It is defined as the percentage of true positive cases out of all the cases where the prediction is positive. Precision = (True Positives / All Predicted Positives) * 100 = (TP / (TP + FP)) * 100
Recall: (rate of correctly identified positive cases)
It can be defined as the fraction of positive cases that are correctly identified. Recall = (TP / (TP + FN)) * 100
(It majorly takes into account the cases that are positive in reality, e.g. where there was a forest fire and the machine either detected it or it didn’t. That is, it considers True Positives (there was a forest fire in reality and the model predicted a forest fire) and False Negatives (there was a forest fire and the model did not predict it).)
F1 Score: defined as the measure of balance between precision and recall.
F1 = 2 * (Precision * Recall) / (Precision + Recall)
When the F1 score is high (close to 1, or 100%), we can say the AI model works efficiently.
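These formulas can be checked with a small Python sketch; the confusion-matrix counts below are made-up illustrative values, not results from any real model:

# Illustrative confusion-matrix counts (assumed values, not from a real model)
tp, tn, fp, fn = 60, 25, 5, 10        # true positives, true negatives, false positives, false negatives

total = tp + tn + fp + fn
accuracy  = (tp + tn) / total * 100             # correct predictions out of all cases
precision = tp / (tp + fp) * 100                # true positives out of all predicted positives
recall    = tp / (tp + fn) * 100                # true positives out of all actual positive cases
f1 = 2 * precision * recall / (precision + recall)   # balance between precision and recall

print(f"Accuracy {accuracy:.1f}%, Precision {precision:.1f}%, Recall {recall:.1f}%, F1 {f1:.1f}%")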
Which Metric is Important? Choosing between Precision and Recall depends on the condition in which the model has been deployed. In a case like Forest Fire, a False Negative can cost us a lot and is risky too. Imagine no alert being given even when there is a Forest Fire. The whole forest might burn down. Another case where a False Negative can be dangerous is Viral Outbreak. Imagine a deadly virus has started spreading and the model which is supposed to predict a viral outbreak does not detect it. The virus might spread widely and infect a lot of people. On the other hand, there can be cases in which the False Positive condition costs us more than False Negatives. One such case is Mining. Imagine a model telling you that there exists treasure at a point and you keep on digging there but it turns out that it is a false alarm. Here, False Positive case (predicting there is treasure but there is no treasure) can be very costly. Similarly, let’s consider a model that predicts that a mail is spam or not. If the model always predicts that the mail is spam, people would not look at it and eventually might lose important information. Here also False Positive condition (Predicting the mail as spam while the mail is not spam) would have a high cost.
UNIT-6 NLP (NATURAL LANGUAGE PROCESSING)
NLP is a branch of AI that enables computers to process human language in the form of text or voice data. It is one of the three domains of AI.
Applications of NLP: Automatic Text Summarization (algorithms or programs that reduce the text size and create a summary of the text data) / Sentiment Analysis (opinion mining): a technique used to determine whether data is positive, negative or neutral; used to identify opinions and sentiment expressed online about a company's products, to help the company understand what customers think about its products and services / Text Classification (text tagging or text categorization): the process of categorizing unstructured text into organized groups; using NLP, text classifiers can automatically analyze text and then assign a set of pre-defined tags or categories based on its content
/ Virtual Assistants (Smart Assistants): e.g. Google Assistant, Microsoft's Cortana, Apple's Siri, Amazon's Alexa. These are NLP-based programs automated to communicate in a human voice, mimicking human interaction to help ease day-to-day tasks such as showing weather reports, creating reminders and making shopping lists.
Digital Phone Calls: automated systems direct customer calls to a service representative or an online chatbot, which responds to customer requests with helpful information / Chatbot: a computer program that simulates and processes human conversation (written or spoken), allowing humans to interact with digital devices as if they were communicating with a real person / Two types of chatbots: script bots (easy to make, limited functionality, no or limited coding required) and smart bots (flexible and powerful, wide functionality, coding required, use AI and ML), e.g. Google Assistant, Cortana, Siri and Alexa are smart bots
Examples of chatbots: Mitsuku Bot, Jabberwacky, Rose, CleverBot / Syntax: refers to the grammatical structure of a sentence / Semantics: refers to the meaning of the sentence / Stemming: a technique used to extract the base form of words by removing affixes from them
HUMAN VS COMPUTER LANGUAGES AND NLP
The main function of both human and computer languages is the same: getting a message across (communication).
Humans communicate through language which we process all the time. Our brain keeps on processing the sounds that it hears around itself and tries to make sense out of them all the time.
On the other hand, the computer understands the language of numbers. Everything that is sent to the machine has to be converted to numbers. And while typing, if a single mistake is made, the computer throws an error and does not process that part. The communications made by the machines are very basic and simple.
Human languages are natural and used for communication between people, often varying by culture and region. They can be ambiguous and context-dependent, and are dynamic, changing over time.
Computer languages, on the other hand, are synthetic and used for communication between computers and humans.
Text Normalization: text normalization divides the text into smaller components called tokens (words). Steps for normalization (a short NLTK sketch of these steps appears after the stop-word notes below): 1. Sentence segmentation: the whole text is divided into individual sentences 2. Tokenisation: each sentence is further divided into tokens 3. Removing stop words, special characters and numbers 4. Stemming: a technique used to extract the base form of words by removing affixes from them; it is like cutting down the branches of a tree to its stem. For example, the stem of the words eating, eats, eaten is eat.
E.g. crying -> cry, smiling -> smili, caring -> car, smiles -> smile (the stemmed word may not be a meaningful word) 5. Lemmatization: the process of converting a word to its actual root form linguistically (as per the language). The words extracted through lemmatization are called lemmas.
E.g. cried -> cry, smiling -> smile, smiled -> smile, caring -> care
Stop words: words in any language which do not add much meaning to a sentence. They can safely be ignored without sacrificing the meaning of the sentence, e.g. is, are, a, an, so, etc.
Advantages of removing stop words: 1. The dataset size decreases 2. The time to train the AI model decreases 3. The performance of the AI model improves
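A minimal sketch of normalization steps 1-5 using Python's NLTK library (named later in these notes); the sample text, and the choice of PorterStemmer and WordNetLemmatizer, are illustrative assumptions rather than something the notes prescribe:

import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download('punkt'); nltk.download('stopwords'); nltk.download('wordnet')   # one-time resource downloads

text = "The children are smiling. They cried yesterday!"             # illustrative sample text

sentences = sent_tokenize(text)                                       # 1. sentence segmentation
tokens = [w for s in sentences for w in word_tokenize(s)]             # 2. tokenisation
tokens = [w.lower() for w in tokens if w.isalpha()]                   # 3. drop special characters/numbers, lower-case
tokens = [w for w in tokens if w not in stopwords.words('english')]   # 3. remove stop words

stems  = [PorterStemmer().stem(w) for w in tokens]                    # 4. stemming (stems may not be real words)
lemmas = [WordNetLemmatizer().lemmatize(w, pos='v') for w in tokens]  # 5. lemmatisation (real root words)
print(stems, lemmas)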
CONCEPT OF NLP
NLP takes in data of natural languages, in the form of the written and spoken words which humans use in their daily lives, and operates on it.
(a) Text Normalization: divides the text into smaller components called tokens (words). The aim of text normalization is to convert the text to a standard form.
(b) Case Normalisation: convert all the words to the same case (lower case).
(c) Finally, convert to numbers: as computers understand numbers better than alphabets and words, we have to convert the normalised text into numbers.
Bag of Words (BoW): a statistical language model used to analyse text and documents based on word count. It is a representation of text that describes the occurrence of words within a document. A Bag of Words contains two things: (1) a vocabulary of known words, (2) the frequency of those words.
Steps to Implement BoW Model:
1. Text normalisation 2. Design the vocabulary: make the list of all known words in the documents; the collection of these words is called the corpus 3. Create document vectors (the frequency of each vocabulary word in each document) 4. Calculate TF-IDF
Term frequency(TF) is the frequency of a word in one document. Term frequency can easily be found from the document vector table as in that table we mention the frequency of each word of the vocabulary in each document.
Document Frequency (DF): the number of documents in which the word occurs, irrespective of how many times it has occurred in those documents.
In the case of inverse document frequency, we need to put the document frequency in the denominator while the total number of documents is the numerator.
For example, if the document frequency of the word “AMAN” is 2 (it occurs in 2 of the documents) and the total number of documents is 3, then its inverse document frequency is 3/2.
Term frequency Inverse Document Frequency(TF-IDF) is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus.
Term frequency can also be normalised: the number of times a word appears in a document divided by the total number of words in that document. Every document has its own term frequency.
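The Bag of Words and TF-IDF ideas above can be sketched in plain Python. The three short documents are assumed examples (not taken from these notes), and the logarithm applied to IDF in the last step is the common convention rather than something the notes specify:

import math

docs = ["aman and anil are stressed",
        "aman went to a therapist",
        "anil went to download a health chatbot"]        # assumed example documents

tokens = [d.lower().split() for d in docs]                # tokenised documents
vocab = sorted(set(w for doc in tokens for w in doc))     # vocabulary (corpus of known words)

# Document vectors: frequency of every vocabulary word in each document (Bag of Words)
doc_vectors = [[doc.count(w) for w in vocab] for doc in tokens]

N = len(docs)
df  = {w: sum(1 for doc in tokens if w in doc) for w in vocab}   # document frequency (DF)
idf = {w: N / df[w] for w in vocab}                              # inverse document frequency = N / DF

# TF-IDF: term frequency weighted by log(IDF); a word occurring in every document scores 0
tfidf = [{w: doc.count(w) * math.log10(idf[w]) for w in vocab} for doc in tokens]
print(doc_vectors)
print(tfidf[0])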
Natural Language Toolkit (NLTK). NLTK is one of the leading platforms for building Python programs that can work with human language data.
Example of Multiple meanings of a word –
His face turns red after consuming the medicine. Meaning – is he having an allergic reaction, or is he unable to bear the taste of the medicine?
Example of Perfect syntax, no meaning-
Chickens feed extravagantly while the moon drinks tea. This statement is correct grammatically but it does not make any sense. In Human language, a perfect balance of syntax and semantics is important for better understanding.
Data sciences
Data science is a domain of AI related to data systems and processes, in which the system collects large amounts of data, maintains datasets and derives meaning/sense out of them. The information extracted through data science can be used to make decisions. Data Science analyses the data and helps in making the machine intelligent enough to perform tasks by itself.
Artificial Intelligence is a technology which completely depends on data. It is the data fed into the machine which makes it intelligent. Depending upon the type of data used, the work falls into the three AI domains listed earlier: Data Science, Computer Vision and NLP.
Types of Data / Data Formats for Data Science: CSV (Comma Separated Values), Excel spreadsheet, SQL (Structured Query Language), XML (eXtensible Markup Language), JSON (JavaScript Object Notation), XLSX (a Microsoft Office Open XML format spreadsheet file)
Data Access: after collecting the data, to be able to use it for programming purposes, we should know how to access it in Python code. To make our lives easier, there exist various Python packages which help us access structured data (in tabular form) inside the code: NumPy, Matplotlib, Pandas.
Pandas: Pandas is a software library written for the Python programming language for data manipulation and analysis.
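A small sketch of data access with Pandas; the file name students.csv and the Marks column are hypothetical examples:

import pandas as pd

df = pd.read_csv("students.csv")     # load a structured (tabular) dataset; file name is hypothetical

print(df.head())                     # first five rows
print(df.describe())                 # basic statistics of the numeric columns
print(df["Marks"].mean())            # mean of an assumed 'Marks' column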
Applications of Data Sciences: Fraud and Risk Detection, Genetics & Genomics, Internet Search, Targeted Advertising, Website Recommendations, Airline Route Planning
PIXEL: the word “pixel” means picture element. Every photograph, in digital form, is made up of pixels; they are the smallest unit of information that makes up a picture. / The number of pixels in an image is sometimes called its resolution. The resolution of a digital image is measured using its pixels, specifically in pixels per inch (PPI); for printing, picture resolution is measured in dots per inch (DPI). / RGB Images: all the images that we see around us are coloured images. These images are made up of three primary colours: Red, Green and Blue.
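A tiny NumPy sketch (NumPy is one of the packages named above) showing that a digital RGB image is just an array of pixel values; the 2 x 2 image is an illustrative assumption:

import numpy as np

# A 2 x 2 RGB "image": every pixel stores three values (Red, Green, Blue) in the range 0-255
image = np.array([[[255, 0, 0], [0, 255, 0]],
                  [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

print(image.shape)    # (2, 2, 3): height x width x colour channels
print(image[0, 0])    # [255 0 0] -> the pure-red pixel in the top-left corner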
K-Nearest Neighbours (KNN) model/algorithm: mostly used for classification problems / The KNN algorithm stores all the available data and classifies a new data point based on similarity / The data point is classified on the basis of its k nearest neighbours, followed by a majority vote of those neighbours; a query point (unlabelled point) is assigned the class which has the most representatives among its nearest neighbours / The value of K signifies the number of neighbours considered
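A minimal, self-contained sketch of the KNN idea; the points, labels and query are made-up illustrative values, and a real project would typically use a library implementation instead:

from collections import Counter

# Tiny labelled dataset: (feature1, feature2) -> class label (illustrative values)
points = [((1, 1), 'A'), ((2, 1), 'A'), ((1, 2), 'A'), ((4, 4), 'B'), ((5, 5), 'B')]

def knn_classify(query, data, k=3):
    """Classify `query` by the majority vote of its k nearest neighbours (squared Euclidean distance)."""
    nearest = sorted(data, key=lambda item: (item[0][0] - query[0]) ** 2 + (item[0][1] - query[1]) ** 2)[:k]
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]

print(knn_classify((2, 2), points, k=3))   # -> 'A', the majority class among the 3 nearest neighbours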
AI Project Cycle: a step-by-step process that a person should follow to develop an AI project to solve a problem using proven scientific methods. The AI Project Cycle mainly has 5 stages: Problem Scoping - understanding the problem: finding the various factors which affect the problem and defining the goal or aim of the project / Data Acquisition - collecting accurate and reliable data / Data Exploration - arranging the data uniformly / Modelling - creating models from the data / Evaluation - evaluating the project
4Ws Problem Canvas: the 4Ws Problem Canvas helps in identifying the key elements related to the problem. / Who - who all are affected directly and indirectly by the problem; they are called the stakeholders / What - understanding and identifying the nature of the problem; under this block you also gather evidence to prove that the problem you have selected exists / Where - where the problem arises: the situation, context and location / Why - how the solution will help the stakeholders and how it will benefit them as well as society
Data Exploration is the process of arranging the gathered data uniformly for a better understanding. Data can be arranged in the form of a table, by plotting a chart, or by making a database. The tools used to visualize the acquired data are known as data visualization or exploration tools. / A few data visualization tools: Google Charts, Tableau, Fusion Charts, Highcharts
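A short sketch of exploring data with Matplotlib (one of the Python packages named in these notes); the monthly figures are made-up illustrative values:

import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]     # illustrative data: projects completed per month
projects = [3, 5, 4, 7]

plt.bar(months, projects)                 # visualise the acquired data as a simple bar chart
plt.xlabel("Month")
plt.ylabel("Projects completed")
plt.title("Exploring data with a chart")
plt.show()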
AI models: 2 types
(A) Learning-based (Supervised Learning, Unsupervised Learning, Reinforcement Learning) (S-U-R)
(B) Rule-based
Supervised Learning: the dataset which is fed to the machine is labelled. A label is some information which can be used as a tag for data
Two Types
a. Classification – where the data is classified according to labels. For example, in a grading system, students are classified on the basis of the grades they obtain with respect to their marks in the examination. This model works on a discrete dataset, which means the data need not be continuous.
b. Regression – such models work on continuous data. For example, if you wish to predict your next salary, you would put in the data of your previous salary, any increments, etc., and train the model. Here, the data fed to the machine is continuous. (A toy sketch contrasting the two follows below.)
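A toy Python sketch contrasting discrete classification output with continuous regression output; the salary figures and grade thresholds are assumptions, and a real supervised model would learn such rules from labelled data rather than having them hand-written:

import numpy as np

# Regression: continuous output - predict the next salary from past, labelled salary data (assumed figures)
years  = np.array([1, 2, 3, 4])
salary = np.array([30000, 33000, 36500, 40000])
slope, intercept = np.polyfit(years, salary, 1)        # fit a straight line to the labelled data
print("Predicted salary in year 5:", slope * 5 + intercept)

# Classification: discrete output - assign a grade label from marks (hand-written rule for illustration)
def classify_grade(marks):
    return "A" if marks >= 80 else "B" if marks >= 60 else "C"

print(classify_grade(75))    # -> 'B'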
Unsupervised Learning
An unsupervised learning model works on unlabelled dataset. This means that the data which is fed to the machine is random and there is a possibility that the person who is training the model does not have any information regarding it .
There are two types of unsupervised learning models in AI –
a. Clustering – refers to the unsupervised learning technique that can cluster the unknown data according to patterns or trends found in it. The developer may already be aware of the patterns noticed, or it may even generate some original patterns as a result.
b. Dimensionality Reduction – if you have a large number of features, it can be beneficial to reduce them using an unsupervised step before moving on to the supervised steps. Numerous unsupervised learning techniques include a transform step that can be used to reduce the dimensionality.
Reinforcement Learning
In this type of learning, the system works on a reward-or-penalty policy. An agent performs an action (positive or negative) in the environment, which is taken as input by the system; the system then changes the state of the environment, and the agent is provided with a reward or a penalty.
The system also builds a policy that decides what action should be taken under a specific condition.
Data features: refer to the type of data you want to collect, e.g. salary amount, increment percentage, increment period, bonus, etc. / Various ways in which you can collect data are surveys, interviews, observations, sensors, cameras, APIs (Application Program Interface), web scraping, etc. / Web Scraping: collecting data from the web using certain technologies / Sensors: part of IoT (the Internet of Things) / Cameras: capture visual information; that information, called an image, is then used as a source of data / Observations: when we observe something carefully we gather information / Surveys: a method of gathering specific information from a sample of people
Sustainable Development: to develop for the present without exploiting the resources of the future. / The Sustainable Development Goals (SDGs), also known as the Global Goals, were adopted by all United Nations Member States in 2015 as a universal call to action to end poverty, protect the planet and ensure that all people enjoy peace and prosperity.
Constraints: conditions that can be enforced on the attributes of a relation. Constraints come into play whenever we try to insert, delete or update a record in a relation. / Unique: the values under that column are always unique, e.g. Roll_no number(3) unique / Primary key: the column cannot have duplicate values, nor even a null value, e.g. Roll_no number(3) primary key / The main difference between the unique and primary key constraints is that a column specified as unique may have a null value, but the primary key constraint does not allow null values in the column. / Check constraint: limits the values that can be inserted into a column of a table, e.g. marks number(3) check(marks>=0); this declares marks to be of type number, and while inserting or updating the value in marks it is ensured that its value is always greater than or equal to zero. / Default constraint: used to specify a default value for a column of a table automatically; this default value is used when the user does not enter any value for that column, e.g. balance number(5) default 0
JOIN: the JOIN clause is used to combine rows from two or more tables, based on a related/common column between them and a join condition.
An equi join is a type of join that combines tables based on matching values in specified columns. The resultant table contains repeated columns. It is possible to perform an equi join on more than two tables.
The SQL NATURAL JOIN is a type of EQUI JOIN and is structured in such a way that, columns with the same name of associated tables will appear once only.
Referential integrity refers to the relationship between tables. Four features of referential integrity are: (a) It is used to maintain the accuracy and consistency of data in a relationship. (b) It saves time as there is no need to enter the same data in separate tables. (c) It reduces data-entry errors. (d) It summarizes data from related tables.