This post explains how to install R on Windows and how to work through a simple machine learning project in R using a small dataset.
First, download R from https://cran.r-project.org/bin/windows/base/.
You can download the latest version of R.
1. Once the download completes, install R on your machine. This is like any other software installation; no special instructions are required.
2. After a successful installation, set up the path: go to My Computer, right-click, open Environment Variables, and under System variables add the R bin directory to the Path variable. For example (adjust the version number to the release you installed):
C:\Program Files\R\R-3.5.1\bin
3. After setting up the path, start R.
4. Open a command prompt and type R.
5. You should now see the R terminal.
6. Next, let's look at what a machine learning project is and how datasets fit into it.
7. When we are applying machine learning to our own datasets, we are working on a project.
The process of a machine learning project may not be linear, but there are a number of well-known steps:
Define Problem.
Prepare Data.
Evaluate Algorithms.
Improve Results.
Present Results.
8. The best way to really come to terms with a new platform or tool is to work through a machine learning project end-to-end and cover the key steps:
loading data, summarizing the data, evaluating algorithms, and making some predictions.
Machine Learning using R Step By Step
Now it is time to work through a simple machine learning program using R and the built-in dataset called iris.
We have already installed R and started it.
Install the required packages using the following syntax.
Packages are third party add-ons or libraries that we can use in R.
    install.packages("caret")
    # While installing the package, R asks you to select a mirror; you can select the default one.
    # Alternatively, install caret together with the packages it depends on and suggests:
    install.packages("caret", dependencies=c("Depends", "Suggests"))
    install.packages("ellipse")
    # Load the package we are going to use.
    library(caret)

Load the built-in data and rename it using the following syntax.

    # Attach the iris dataset to the current environment
    data(iris)
    # Rename the iris dataset to dataset
    dataset <- iris

The iris data is now loaded in R and accessible through the variable called dataset.

Next we create a validation dataset. We split the loaded dataset in two: 80% is used to train our models and 20% is held back as a validation dataset.

    # Create a list of 80% of the rows in the original dataset to use for training
    validation_index <- createDataPartition(dataset$Species, p=0.80, list=FALSE)
    # Select the other 20% of the data for validation
    validation <- dataset[-validation_index,]
    # Use the 80% sample for training and testing the models
    dataset <- dataset[validation_index,]

We now have training data in the dataset variable and a validation set, to be used later, in the validation variable. Note that we replaced our dataset variable with the 80% sample of the dataset.

1. The dim function. We can get a quick idea of how many instances (rows) and how many attributes (columns) the data contains with the dim function.
    dim(dataset)

2. Attribute types. Knowing the types is important because it gives us an idea of how to better summarize the data we have and which transforms we might need to prepare the data before we model it.

    sapply(dataset, class)

3. The head function displays the first rows of the data (six by default).

    head(dataset)

4. The class variable is a factor. A factor is a type that has multiple class labels, or levels.

    levels(dataset$Species)

5. Class distribution. Let's now take a look at the number of instances (rows) that belong to each class. We can view this as an absolute count and as a percentage.

6. Summary of each attribute
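The class distribution step above describes counts and percentages per class but shows no code for it. A minimal sketch in base R follows, assuming the dataset variable from the earlier steps; the names class_counts, class_percent and class_distribution are illustrative, and the fallback to the full iris data is only there so the snippet runs on its own.

```r
# Assumption: `dataset` is the data frame from the earlier steps.
# Fall back to the full built-in iris data so the sketch runs on its own.
if (!exists("dataset")) dataset <- iris

# Absolute count of rows per class
class_counts <- table(dataset$Species)

# The same counts expressed as a percentage of all rows
class_percent <- prop.table(class_counts) * 100

# Side-by-side view: one row per class
class_distribution <- cbind(freq = class_counts, percentage = class_percent)
print(class_distribution)
```

The per-attribute summary then follows.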
    summary(dataset)

Visualize Dataset

We have now seen the basic details of the data. We need to extend that with some visualizations. We are going to look at two types of plots:
1. Univariate plots to better understand each attribute.
2. Multivariate plots to better understand the relationships between attributes.

First we look at the univariate plots, that is, plots of each individual variable. We start by splitting the data into the input attributes x and the output attribute y.

    # Split input and output
    x <- dataset[,1:4]
    y <- dataset[,5]

Given that the input variables are numeric, we can create box-and-whisker plots of each.

    par(mfrow=c(1,4))
    for(i in 1:4) {
      boxplot(x[,i], main=names(iris)[i])
    }

We can also create a bar plot of the Species class variable to get a graphical representation of the class distribution (generally uninteresting in this case because the classes are evenly balanced).

    plot(y)

This confirms what we learned in the last section: the instances are evenly distributed across the three classes.

Multivariate Plots

First let's look at scatterplots of all pairs of attributes and color the points by class. In addition, because the scatterplots show that points for each class are generally separate, we can draw ellipses around them.
    featurePlot(x=x, y=y, plot="ellipse")

We can also look at box-and-whisker plots of each input variable again, but this time broken down into separate plots for each class. This can help to tease out obvious linear separations between the classes.

    featurePlot(x=x, y=y, plot="box")

Next we can get an idea of the distribution of each attribute, again broken down by class value like the box-and-whisker plots. Sometimes histograms are good for this, but in this case we will use probability density plots to give nice smooth lines for each distribution.

    # Density plots for each attribute by class value
    scales <- list(x=list(relation="free"), y=list(relation="free"))
    featurePlot(x=x, y=y, plot="density", scales=scales)

Evaluating the Algorithms

Set up the test harness to use 10-fold cross-validation. We split our dataset into 10 parts, train on 9 and test on 1, and repeat for all combinations of train-test splits. The control settings used here run a single round of 10-fold cross-validation; caret can also repeat the whole process several times with different splits of the data (method="repeatedcv" with the repeats argument), which is not done here.

We are using the metric "Accuracy" to evaluate models. This is the number of correctly predicted instances divided by the total number of instances in the dataset, multiplied by 100 to give a percentage (e.g. 95% accurate). We will use the metric variable when we build and evaluate each model next.
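To make the splitting concrete, here is a minimal base-R sketch of assigning rows to 10 folds. It is an illustration only and not what caret does internally; the names k, folds and fold_sizes are hypothetical, and the full built-in iris data is used so the snippet runs on its own.

```r
# Illustration only: plain random fold assignment on the built-in iris data.
set.seed(7)
k <- 10
n <- nrow(iris)

# Give every row a fold number 1..k, then shuffle the assignment
folds <- sample(rep(seq_len(k), length.out = n))

# Each fold serves once as the test part; the other k-1 folds are used for training
fold_sizes <- sapply(seq_len(k), function(i) {
  test_rows  <- which(folds == i)
  train_rows <- which(folds != i)
  c(train = length(train_rows), test = length(test_rows))
})
print(fold_sizes)
```

On the full 150-row iris data, every round trains on 135 rows and tests on 15; the 80% training sample used in this post has fewer rows per fold. In caret, the same setup is expressed through the control object and metric defined next.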
    control <- trainControl(method="cv", number=10)
    metric <- "Accuracy"

Build 5 different models to predict species from flower measurements:
Linear Discriminant Analysis (LDA).
Classification and Regression Trees (CART).
k-Nearest Neighbors (kNN).
Support Vector Machines (SVM) with a radial kernel.
Random Forest (RF).

    # a) linear algorithms
    # LDA
    set.seed(7)
    fit.lda <- train(Species~., data=dataset, method="lda", metric=metric, trControl=control)
    # b) nonlinear algorithms
    # CART
    set.seed(7)
    fit.cart <- train(Species~., data=dataset, method="rpart", metric=metric, trControl=control)
    # kNN
    set.seed(7)
    fit.knn <- train(Species~., data=dataset, method="knn", metric=metric, trControl=control)
    # c) advanced algorithms
    # SVM
    set.seed(7)
    fit.svm <- train(Species~., data=dataset, method="svmRadial", metric=metric, trControl=control)
    # Random Forest
    set.seed(7)
    fit.rf <- train(Species~., data=dataset, method="rf", metric=metric, trControl=control)

We reset the random number seed before each run to ensure that the evaluation of each algorithm is performed using exactly the same data splits. This ensures the results are directly comparable.

Select the best model

We now have 5 models and an accuracy estimate for each. We need to compare the models to each other and select the most accurate. We can report on the accuracy of each model by first creating a list of the fitted models and then using the summary function.
    # Summarize accuracy of models
    results <- resamples(list(lda=fit.lda, cart=fit.cart, knn=fit.knn, svm=fit.svm, rf=fit.rf))
    summary(results)

We can also create a plot of the model evaluation results and compare the spread and the mean accuracy of each model. There is a population of accuracy measures for each algorithm because each algorithm was evaluated 10 times (10-fold cross-validation).

    dotplot(results)

The results for a single model can also be summarized by printing the fitted object, here fit.lda. This gives a nice summary of what was used to train the model and the mean and standard deviation (SD) of the accuracy achieved, specifically 97.5% accuracy +/- 4%.

Making Predictions with predict and a Confusion Matrix

LDA was the most accurate model. Now we want to get an idea of the accuracy of the model on our validation set. This gives us an independent final check on the accuracy of the best model. It is valuable to keep a validation set in case you made a slip during training, such as overfitting to the training set or a data leak. Both result in an overly optimistic estimate.

We can run the LDA model directly on the validation set and summarize the results in a confusion matrix.

    predictions <- predict(fit.lda, validation)
    confusionMatrix(predictions, validation$Species)
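
The accuracy figure reported alongside a confusion matrix can be reproduced by hand. A minimal base-R sketch follows, assuming two factor vectors of predicted and actual labels; the small actual and predicted vectors are made-up illustration data, not output of the models above.

```r
# Made-up labels for illustration only (not results from the models in this post)
species   <- c("setosa", "versicolor", "virginica")
actual    <- factor(c("setosa", "setosa", "versicolor", "versicolor",
                      "virginica", "virginica"), levels = species)
predicted <- factor(c("setosa", "setosa", "versicolor", "virginica",
                      "virginica", "virginica"), levels = species)

# Cross-tabulate predictions (rows) against the true classes (columns)
confusion <- table(Prediction = predicted, Reference = actual)
print(confusion)

# Accuracy = correctly predicted instances / all instances, as a percentage
accuracy <- sum(diag(confusion)) / sum(confusion) * 100
print(accuracy)
```

With the objects from this post, the same kind of table would come from table(predictions, validation$Species); the diagonal holds the correctly classified flowers.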