{"id":130,"date":"2022-06-19T18:20:00","date_gmt":"2022-06-19T18:20:00","guid":{"rendered":"http:\/\/www.soumyanath.in\/wp\/2022\/06\/19\/geo-spatial-analytics-with-r\/"},"modified":"2022-06-18T18:21:03","modified_gmt":"2022-06-18T18:21:03","slug":"geo-spatial-analytics-with-r","status":"publish","type":"post","link":"https:\/\/www.soumyanath.in\/wp\/2022\/06\/19\/geo-spatial-analytics-with-r\/","title":{"rendered":"Geo-Spatial Analytics with R"},"content":{"rendered":"<div dir=\"ltr\" style=\"text-align: left;\">I got couple of questions about what is R. Well, R<span style=\"background-color: white;\">&nbsp;is&nbsp;freely available language and environment for statistical computing and graphics. R provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering. R is a free software. Here the word free is used in the same sense as freedom. R can be developed, modified and distributed freely, provided one abides by the GNU General Public License. You may&nbsp;<\/span><span style=\"background-color: white;\">consult the&nbsp;<\/span><a href=\"https:\/\/www.r-project.org\/\" style=\"background: white; color: purple;\" target=\"_top\" rel=\"noopener noreferrer\">R project homepage<\/a><span style=\"background-color: white;\">&nbsp;for further information.<\/span><\/p>\n<p>Being Open source and free, lot of interesting development has happened in R, that help us to perform data analysis quickly. R has many excellent library that can put most costly commercial products to shame. R is powerful, though not very easy to learn. There are too many choices in R and one need to devote considerable time to understand what works best for a particular situation.<\/p>\n<p>In this blog, we will see how to read data from a RDBMS &#8211; MySQL. Perform some amount of data manipulation to generate reports. We will also present data in Geo-spatial reports in addition to &nbsp;standard charts.<\/p>\n<p>For our tutorial, we use month-wise sales data at dealer level. We also have address of dealers in our database that we will leverage to generate sales data at district level. We have this data in a relational database &#8211; MySQL. R has library to extract data from most standard data bases. Here to access this data we need to first connect R with the database. This require RMySQL package, which is currently not installed in my machine. I need to get this package. So I give a command in R:<\/p>\n<p><span style=\"background-color: white; color: blue; font-family: monospace; line-height: 15px; white-space: pre;\">install.packages(&#8220;RMySQL&#8221;)<\/span><br \/>\n<span style=\"background-color: white; color: #888888; font-family: monospace; line-height: 15px; white-space: pre;\"><br \/>\n<\/span>It gives me a list of server to choose form. I select the closest server in India, about a minute later I get confirmation that the package is downloaded but it has failed with an error. A quick look in the terminal window tells me that it could not find mysqlclient-dev library. This error is kind of expected in a Linux system. Unlike windows, linux machines compile the code in local machine. I have not installed mysql client programs in my machine, all I have is a server. Naturally RMySQL installation fails. Error message gives me the way out is to install libmysqlclient-dev or &nbsp;libmariadbclient-dev. I install this library with in Ubuntu:<\/p>\n<p><span style=\"color: blue; font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;\">sudo apt-get install libmysqlclient-dev<\/span><\/p>\n<p>and then again try installing RMySQL. This time installation succeeds without any problem.<\/p>\n<p><span style=\"color: red;\"><b>Note:<\/b> <i>There are some alternate remedy suggested like installing r-client-dev of libmariadbclient-dev. These solutions do not work, at least in my Ubuntu 14.10 machine. So stick to the suggestion pointed in error message.<\/i><\/span><\/p>\n<div><\/div>\n<div style=\"text-align: left;\">To connect to MySQL one needs userid and password. While these can be given in script, it is actually a foolish idea to do so. It is quite easy to share program script with someone with userid and password, I would never take such chance and prefer to use a configuration file to keep the connection recipe secure. The file where these information is kept should be named <span style=\"color: blue;\">.my.cnf <\/span>and placed in the home directory. Here is how you set up .my.cnf file (or &#8216;c:\/my.cnf&#8217; under Windows)<\/div>\n<div><\/div>\n<div>\n<div># this is a comment<\/div>\n<div>[client]<\/div>\n<div>user = dj<\/div>\n<div>host = localhost<\/div>\n<div><\/div>\n<div>[rs-dbi]<\/div>\n<div>database = s-data<\/div>\n<div><\/div>\n<div>[lasers]<\/div>\n<div>user = opto<\/div>\n<div>database = opto<\/div>\n<div>password = pure-light<\/div>\n<div>host = merced<\/div>\n<div>&#8230;<\/div>\n<div>[iptraffic]<\/div>\n<div>host = data<\/div>\n<div>database = iptraffic<\/div>\n<div><\/div>\n<div>This file should be readeable only by you. Inside each section, MySQL parameters may be specified one per line (e.g., user = opto). MySQL always considers default options from the [client] group for connecting to a server. To override or add additional options, R MySQL combines default options from the [rs-dbi] group, but you may specifiy you own group in the dbConnect call to tailor your environment. Note that to override options, you must place your group after the [client] group in configuration file.<\/div>\n<div><\/div>\n<div>For instance, if you define a group, say, [iptraffic], then instead of including all these parameters in the call to dbConnect, you simply supply the name of the group, e.g., dbConnect(mgr, group = &#8220;iptraffic&#8221;).<\/div>\n<\/div>\n<div><\/div>\n<div>\n<pre><code><span style=\"color: #007020; font-weight: bold;\">library<\/span>(DBI)\n<span style=\"color: #60a0b0; font-style: italic;\"># Connect to my-db as defined in ~\/.my.cnf<\/span>\ncon &lt;-<span style=\"color: #4070a0;\"> <\/span><span style=\"color: #007020; font-weight: bold;\">dbConnect<\/span>(RMySQL::<span style=\"color: #007020; font-weight: bold;\">MySQL<\/span>(), <span style=\"color: #902000;\">group =<\/span> <span style=\"color: #4070a0;\">\"my-db\"<\/span>)\n\n<span style=\"color: #007020; font-weight: bold;\">dbListTables<\/span>(con)\n<span style=\"color: #007020; font-weight: bold;\">dbWriteTable<\/span>(con, <span style=\"color: #4070a0;\">\"mtcars\"<\/span>, mtcars)\n<span style=\"color: #007020; font-weight: bold;\">dbListTables<\/span>(con)\n\n<span style=\"color: #007020; font-weight: bold;\">dbListFields<\/span>(con, <span style=\"color: #4070a0;\">\"mtcars\"<\/span>)\n<span style=\"color: #007020; font-weight: bold;\">dbReadTable<\/span>(con, <span style=\"color: #4070a0;\">\"mtcars\"<\/span>)\n\n<span style=\"color: #60a0b0; font-style: italic;\"># You can fetch all results:<\/span>\nres &lt;-<span style=\"color: #4070a0;\"> <\/span><span style=\"color: #007020; font-weight: bold;\">dbSendQuery<\/span>(con, <span style=\"color: #4070a0;\">\"SELECT * FROM mtcars WHERE cyl = 4\"<\/span>)\n<span style=\"color: #007020; font-weight: bold;\">dbFetch<\/span>(res)\n<span style=\"color: #007020; font-weight: bold;\">dbClearResult<\/span>(res)\n\n<span style=\"color: #60a0b0; font-style: italic;\"># Or a chunk at a time<\/span>\nres &lt;-<span style=\"color: #4070a0;\"> <\/span><span style=\"color: #007020; font-weight: bold;\">dbSendQuery<\/span>(con, <span style=\"color: #4070a0;\">\"SELECT * FROM mtcars WHERE cyl = 4\"<\/span>)\nwhile(!<span style=\"color: #007020; font-weight: bold;\">dbHasCompleted<\/span>(res)){\n  chunk &lt;-<span style=\"color: #4070a0;\"> <\/span><span style=\"color: #007020; font-weight: bold;\">dbFetch<\/span>(res, <span style=\"color: #902000;\">n =<\/span> <span style=\"color: #40a070;\">5<\/span>)\n  <span style=\"color: #007020; font-weight: bold;\">print<\/span>(<span style=\"color: #007020; font-weight: bold;\">nrow<\/span>(chunk))\n}\n<span style=\"color: #60a0b0; font-style: italic;\"># Clear the result<\/span>\n<span style=\"color: #007020; font-weight: bold;\">dbClearResult<\/span>(res)\n\n<span style=\"color: #60a0b0; font-style: italic;\"># Disconnect from the database<\/span>\n<span style=\"color: #007020; font-weight: bold;\">dbDisconnect<\/span>(con)<\/code><\/pre>\n<h2 style=\"background: rgb(255, 255, 255); color: #666666; font-family: monospace; font-size: large;\">MySQL configuration file<\/h2>\n<p>Instead of specifying a username and password in calls to&nbsp;<code style=\"white-space: pre;\">dbConnect()<\/code>, it&#8217;s better to set up a MySQL configuration file that names the databases that you connect to most commonly. This file should live in<code style=\"white-space: pre;\">~\/.my.cnf<\/code>&nbsp;and look like:<\/p>\n<pre><code>[database_name]\noption1=value1\noption2=value2<\/code><\/pre>\n<p>If you want to run the examples, you&#8217;ll need to set the proper options in the&nbsp;<code style=\"white-space: pre;\">[rs-dbi]<\/code>&nbsp;group of any MySQL option file, such as \/etc\/my.cnf or the .my.cnf file in your home directory. For a default single user install of MySQL, the following code should work:<\/p>\n<pre><code>[rs-dbi]\ndatabase=test\nuser=root\npassword=<\/code><\/pre>\n<\/div>\n<div><code><br \/>\n<\/code><\/div>\n<div><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>I got couple of questions about what is R. Well, R&nbsp;is&nbsp;freely available language and environment for statistical computing and graphics. R provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, &hellip; <a href=\"https:\/\/www.soumyanath.in\/wp\/2022\/06\/19\/geo-spatial-analytics-with-r\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[31,34],"tags":[],"class_list":["post-130","post","type-post","status-publish","format-standard","hentry","category-analytics","category-spatial-data"],"_links":{"self":[{"href":"https:\/\/www.soumyanath.in\/wp\/wp-json\/wp\/v2\/posts\/130","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.soumyanath.in\/wp\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.soumyanath.in\/wp\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.soumyanath.in\/wp\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.soumyanath.in\/wp\/wp-json\/wp\/v2\/comments?post=130"}],"version-history":[{"count":2,"href":"https:\/\/www.soumyanath.in\/wp\/wp-json\/wp\/v2\/posts\/130\/revisions"}],"predecessor-version":[{"id":165,"href":"https:\/\/www.soumyanath.in\/wp\/wp-json\/wp\/v2\/posts\/130\/revisions\/165"}],"wp:attachment":[{"href":"https:\/\/www.soumyanath.in\/wp\/wp-json\/wp\/v2\/media?parent=130"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.soumyanath.in\/wp\/wp-json\/wp\/v2\/categories?post=130"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.soumyanath.in\/wp\/wp-json\/wp\/v2\/tags?post=130"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}