{"id":454,"date":"2019-08-07T16:33:55","date_gmt":"2019-08-07T21:33:55","guid":{"rendered":"http:\/\/fowlercs.com\/wp\/?p=454"},"modified":"2019-08-08T13:24:29","modified_gmt":"2019-08-08T18:24:29","slug":"machine-learning-supervised-and-unsupervised","status":"publish","type":"post","link":"http:\/\/fowlercs.com\/wp\/machine-learning-supervised-and-unsupervised\/","title":{"rendered":"Machine Learning: Supervised and Unsupervised"},"content":{"rendered":"<div class=\"reader-article-content\" dir=\"ltr\">\n<p>Supervised learning typically takes the form of classification or regression. We know the input and output variables, and try to make sense of the relationships between the two. Tembhurkar, Tugnayat, &amp; Nagdive (2014) refer to this as Descriptive mining. Common methods include decision trees, the kNN algorithm, regression, and discriminant analysis. The methods are dependent upon the type of output variable: continuous outputs will use regression methods, while discrete outputs will use classification methods.<\/p>\n<p>For example, a human resources division in a large multinational company wants to determine what factors have contributed to employee attrition over the past two years. A decision tree methodology can produce a simple \u201cif-then\u201d map of which attributes combine to result in a separated employee. An example tree might point out that a male employee over the age of 45, working in Division X, who commutes more than 25 miles from home, has a manager 10 years or more his junior, and has been in the same unit for more than seven years is a prime candidate for attrition. Although many of the input variables are continuous, a decision tree method makes the data manageable and actionable for human resources division use.<\/p>\n<p>Unsupervised learning usually takes the form of clustering or association. The output variables are not known, and we are relying on the system to make sense of the data. There is no <em>a priori<\/em> knowledge of the outcomes. Tembhurkar et al. (2014) refer to this as Prescriptive mining. 
Common methods include neural networks, anomaly detection, k-means clustering, and principal components analysis. The methods are dependent upon the type of data input: discrete (categorical) variables typically suit association methods, while continuous variables typically suit clustering methods such as k-means.<\/p>\n<p>For example, a multi-level marketing company has a number of data points on its associates: units sold, associates recruited, years in the program, rewards program tier, et cetera. The company knows the associates can be grouped into performance categories akin to <em>novice<\/em> and <em>expert<\/em> but is unclear on both how many categories to look at and what factors are important. Principal components analysis and k-means clustering can reveal how the associates differentiate themselves based on the available variables and suggest an appropriate number of categories within which to classify them.<\/p>\n<p>References<\/p>\n<p>Brownlee, J. (2016, September 22). Supervised and unsupervised machine learning algorithms. Retrieved from <a href=\"https:\/\/machinelearningmastery.com\/supervised-and-unsupervised-machine-learning-algorithms\/\" target=\"_blank\" rel=\"nofollow noopener\">https:\/\/machinelearningmastery.com\/supervised-and-unsupervised-machine-learning-algorithms\/<\/a><\/p>\n<p>Soni, D. (2018, March 22). Supervised vs. Unsupervised learning \u2013 towards data science. Retrieved from <a href=\"https:\/\/towardsdatascience.com\/supervised-vs-unsupervised-learning-14f68e32ea8d\" target=\"_blank\" rel=\"nofollow noopener\">https:\/\/towardsdatascience.com\/supervised-vs-unsupervised-learning-14f68e32ea8d<\/a><\/p>\n<p>Tembhurkar, M. P., Tugnayat, R. M., &amp; Nagdive, A. S. (2014). Overview on data mining schemes to design business intelligence framework for mobile technology. 
<em>International Journal of Advanced Research in Computer Science, 5<\/em>(8).<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Supervised learning typically takes the form of classification or regression. We know the input and output variables, and try to make sense of the relationships between the two. Tembhurkar, Tugnayat, &amp; Nagdive (2014) refer to this as Descriptive mining. Common methods include decision trees, the kNN algorithm, regression, and discriminant analysis. The methods are dependent upon the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_exactmetrics_skip_tracking":false,"footnotes":""},"categories":[98],"tags":[155,152,153,154],"_links":{"self":[{"href":"http:\/\/fowlercs.com\/wp\/wp-json\/wp\/v2\/posts\/454"}],"collection":[{"href":"http:\/\/fowlercs.com\/wp\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/fowlercs.com\/wp\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/fowlercs.com\/wp\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/fowlercs.com\/wp\/wp-json\/wp\/v2\/comments?post=454"}],"version-history":[{"count":2,"href":"http:\/\/fowlercs.com\/wp\/wp-json\/wp\/v2\/posts\/454\/revisions"}],"predecessor-version":[{"id":512,"href":"http:\/\/fowlercs.com\/wp\/wp-json\/wp\/v2\/posts\/454\/revisions\/512"}],"wp:attachment":[{"href":"http:\/\/fowlercs.com\/wp\/wp-json\/wp\/v2\/media?parent=454"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/fowlercs.com\/wp\/wp-json\/wp\/v2\/categories?post=454"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/fowlercs.com\/wp\/wp-json\/wp\/v2\/tags?post=454"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}