{"id":6,"date":"2014-10-03T20:00:31","date_gmt":"2014-10-03T20:00:31","guid":{"rendered":"http:\/\/www.hort.net\/techblog\/?p=6"},"modified":"2014-11-21T19:53:08","modified_gmt":"2014-11-21T19:53:08","slug":"upgrading-hort-net-to-redis","status":"publish","type":"post","link":"https:\/\/www.hort.net\/techblog\/2014\/10\/03\/upgrading-hort-net-to-redis\/","title":{"rendered":"Upgrading hort.net to Redis, part 1"},"content":{"rendered":"<div class=\"share_buttons_simple_use_buttons\" style=\"padding: 10px 0; display: inline-block\"><div class=\"tweet_button\" style=\"float: left; vertical-align: top\"><a href=\"https:\/\/twitter.com\/share\" class=\"twitter-share-button\" data-url=\"https:\/\/www.hort.net\/techblog\/2014\/10\/03\/upgrading-hort-net-to-redis\/\" data-text=\"Upgrading hort.net to Redis, part 1\" data-count=\"none\">Tweet<\/a><script type=\"text\/javascript\" src=\"https:\/\/platform.twitter.com\/widgets.js\"><\/script><\/div><div class=\"facebook_like_button\" style=\"float: left; vertical-align: top; margin-left: 10px; max-width: 255px\"><iframe src=\"https:\/\/www.facebook.com\/plugins\/like.php?href=https%3A%2F%2Fwww.hort.net%2Ftechblog%2F2014%2F10%2F03%2Fupgrading-hort-net-to-redis%2F&amp;layout=button_count&amp;show_faces=false&amp;width=450&amp;action=like&amp;colorscheme=light&amp;height=25\" scrolling=\"no\" frameborder=\"0\" style=\"border:none; overflow:hidden; width:450px; height:25px;\" allowTransparency=\"true\"><\/iframe><\/div><\/div><p>We\u2019ve been playing with Redis at hort.net over the past year. Its speed and simplicity make system performance better and programming easier \u2014 what used to be 30 lines of MySQL statements can be condensed into five Redis queries that shave 1\/10th of a second off page loading times. And that\u2019s just for one part of the site.  Redis also scales to multiple servers more easily, so as hort.net grows it won&#8217;t be painful to keep up technically.<\/p>\n<p>After successfully moving our <a title=\"sHORTurl URL shortening service\" href=\"http:\/\/hort.li\/\" target=\"_blank\">sHORTurl<\/a>\u00a0service to Redis (and getting the aforementioned performance gains), we started looking for other ways to use it here.<\/p>\n<p>hort.net is entirely written in custom Perl (through mod_perl) under Apache with a MySQL backend. \u00a0Redis would eliminate the latency in database queries, but because Redis is stored in memory we only wanted to use it in highly repetitive cases that didn\u2019t involve complicated MySQL statements.<\/p>\n<h3>Identifying MySQL queries that could shift to Redis<\/h3>\n<p>I\u00a0restarted MySQL with the \u2013log option so that every database command would be logged in a file. \u00a0After about 21 hours, a short\u00a0shell command was run\u00a0to identify the most frequently queried databases:<\/p>\n<div class=\"horttech-code\"><pre class=\"preserve-code-formatting\">&nbsp;&nbsp;&nbsp;&nbsp;\n$ grep SELECT \/var\/log\/mysql\/query-log | awk -F&#039;FROM&#039; &#039;{print $2}&#039; | awk &#039;{print $1}&#039; | sort | uniq -c | sort -n | tail -10\n1101 authentication\n1294 account.email\n2579 splits\n4025 image\n4383 plant\n7737 vote\n15248 wp_options\n26336 topstory\n29944\n50506 navigation\n<\/pre><\/div>\n<p>The number on the left shows how many queries the database named on the right received during that timeframe.<\/p>\n<p>navigation and topstory are both displayed on every hort.net page, and navigation is actually used in multiple places on each page. wp_options is used by WordPress, and vote, plant, and image are used when images in the plant gallery are viewed (or voted upon). image and plant are fairly large databases that involve some complicated database queries so they wouldn\u2019t make sense to move to Redis. But these three databases looked like great candidates:<\/p>\n<ul>\n<li>navigation<\/li>\n<li>topstory<\/li>\n<li>vote<\/li>\n<\/ul>\n<p>topstory was very straightforward. \u00a0topstory just contained a story_id, headline, url, date, and a bit of text in a database, and the most common query retrieved the most recent horticultural news articles posted on hort.net. \u00a0vote is more complicated because it uses the AVG() and STDDEV() MySQL functions to display higher-voting images first, and navigation was moderately complex because it used a weighting system to sort navigation links. Atlhough topstory didn\u2019t give us the largest database hit, it seemed like the easiest to migrate. \u00a0So we started there.<\/p>\n<h3>Creating\u00a0the Redis schema<\/h3>\n<p>First I had to develop a schema. The old MySQL database had this schema and these queries:<\/p>\n<div class=\"horttech-code\"><pre class=\"preserve-code-formatting\">+-------------+------------------+------+-----+---------+----------------+\n| Field&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | Type&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | Null | Key | Default | Extra&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|\n+-------------+------------------+------+-----+---------+----------------+\n| story_id&nbsp;&nbsp;&nbsp;&nbsp;| int(10) unsigned | NO&nbsp;&nbsp; | MUL | NULL&nbsp;&nbsp;&nbsp;&nbsp;| auto_increment |\n| headline&nbsp;&nbsp;&nbsp;&nbsp;| varchar(255)&nbsp;&nbsp;&nbsp;&nbsp; | NO&nbsp;&nbsp; | MUL |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|\n| url&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | varchar(255)&nbsp;&nbsp;&nbsp;&nbsp; | NO&nbsp;&nbsp; |&nbsp;&nbsp;&nbsp;&nbsp; |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|\n| description | text&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | YES&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp; | NULL&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|\n| date&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;| datetime&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | NO&nbsp;&nbsp; | MUL | NULL&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|\n+-------------+------------------+------+-----+---------+----------------+\n5 rows in set (0.00 sec)\n\n# get last 8 headlines and URLs, displayed on every page\nSELECT SQL_CACHE url, headline FROM topstory ORDER BY date DESC LIMIT 8\n# display last six hours of headlines in RSS\nSELECT SQL_CACHE DATE_FORMAT(DATE_SUB(date, INTERVAL 6 HOUR), \u201c%a, %d %b %Y %T GMT\u201d) FROM topstory WHERE date &lt;= NOW() ORDER BY DATE DESC LIMIT 1\nSELECT SQL_CACHE story_id, headline, url, description, DATE_FORMAT(DATE_SUB(date, INTERVAL 6 HOUR), \u201c%a, %d %b %Y %T GMT\u201d) FROM topstory WHERE date &lt;= NOW() ORDER BY DATE DESC\n# display all stories on the topstories page\nSELECT SQL_CACHE headline, url, date FROM topstory WHERE date &lt;= NOW() ORDER BY date DESC\n<\/pre><\/div>\n<p>Since speed was a primary goal, I had to make sure that our Redis datastore allowed us to easily sort our stories by date since every query relied on that. I ended up with this:<\/p>\n<div class=\"horttech-code\"><pre class=\"preserve-code-formatting\">Hash s:h [story_id] [headline]\nHash s:u [story_id] [url]\nHash s:d [story_id] [description]\nHash s:e [story_id] [epoch time]\nSortedSet s:t [epoch time] [story_id]\nSet s:id [story_id]\n<\/pre><\/div>\n<p>The SortedSet uses the time in seconds since 1970 as our sorting field; with the <a title=\"Redis ZREVRANGEBYSCORE command\" href=\"http:\/\/redis.io\/commands\/zrevrangebyscore\" target=\"_blank\">ZREVRANGEBYSCORE<\/a>\u00a0command we can tell Redis to sort our ids by date and only return the last eight, if desired. \u00a0Once the list of ids is returned the script will use them to do a <a title=\"Redis HMGET command\" href=\"http:\/\/redis.io\/commands\/hmget\" target=\"_blank\">HMGET<\/a> for the url, headline, and if necessary, description.<\/p>\n<p>Before going any further, I wanted to time hort.net page loads before that part of the site was switched to Redis. I threw together a quick script that did 12 page loads five seconds apart, then averaged the load time. This should take other server activity into account, and it should also leverage MySQL caching to give us a true apples-to-apples comparison.  The script was actually run regularly throughout the day and the values below represent the averages.<\/p>\n<div class=\"ezcol ezcol-one-half\">\n<div class=\"horttech-code\"><pre class=\"preserve-code-formatting\">#!\/bin\/sh\n# hort.net front page load timing script\n\ntotal=0\nfor i in {1..12}\ndo\n&nbsp;&nbsp; seconds=`(time lynx -dump -source http:\/\/www.hort.net\/ &gt; \/dev\/null ) |&amp; grep real | sed -r &#039;s\/s$|0m\/\/g&#039; | awk &#039;{print $NF}&#039;`\n&nbsp;&nbsp; echo $seconds&quot; seconds&quot;\n&nbsp;&nbsp; sleep 5s\n&nbsp;&nbsp; total=$(echo $total + $seconds | bc) \ndone\n\necho -n &quot;Average load time: &quot;; echo $total \/ 12 | bc -l\n<\/pre><\/div>\n<\/div><div class=\"ezcol ezcol-one-half ezcol-last\">\n<div class=\"horttech-code\"><pre class=\"preserve-code-formatting\">$ .\/timeit\n0.183 seconds\n0.195 seconds\n0.198 seconds\n0.202 seconds\n0.162 seconds\n0.172 seconds\n0.188 seconds\n0.185 seconds\n0.189 seconds\n0.193 seconds\n0.184 seconds\n0.175 seconds\nAverage load time: .18550000000000000000\n<\/pre><\/div>\n<\/div><div class=\"ezcol-divider\"><\/div><br \/>\nAs you can see, the front page loaded in .1855 seconds on average.<\/p>\n<p>Next I wrote a script to copy the topstory database into Redis. \u00a0It\u2019s a script intended for onetime use, and since stories are manually added\u00a0I didn\u2019t have to worry about losing anything while migrating. It was fairly straightforward, and once it was run it took .979 seconds to move the 615 records into Redis.<\/p>\n<p>Now came the hardest part \u2014\u00a0converting scripts running on the site to use Redis instead. The queries had to be rewritten, although that actually wasn&#8217;t as horrible as expected. The simple query that displayed the eight most recent stories was replaced with three Redis queries from Perl:<\/p>\n<div class=\"horttech-code\"><pre class=\"preserve-code-formatting\">\n# PLACE ids OF 8 MOST RECENT STORIES INTO AN ARRAY CALLED @ids\nmy @ids = $GLOBAL::RedisHandle-&gt;zrevrangebyscore(&#039;s:t&#039;, &#039;+INF&#039;, &#039;-INF&#039;, &#039;LIMIT&#039;, 0, 8);\n# GRAB THE HEADLINES FOR THOSE 8 ids\nmy @headlines = $GLOBAL::RedisHandle-&gt;hmget(&#039;s:h&#039;, @ids);\n# GRAB THE URLS FOR THOSE 8 ids\nmy @urls = $GLOBAL::RedisHandle-&gt;hmget(&#039;s:u&#039;, @ids);\n<\/pre><\/div>\n<h3>How did it perform?<\/h3>\n<p><div class=\"ezcol ezcol-one-half\">\nSo how did the timing go? It shaved off about 1\/100th of a second, so the gains were minimal. I suspect that once the navbar is changed we&#8217;ll see improved gains like we did for sHORTurl because MySQL will no longer be used for the design framework, so the main Web page and things like the mailing list archives won&#8217;t have MySQL overhead anywhere. Pages like the gallery won&#8217;t see any real improvement because they still rely on MySQL for other things that aren&#8217;t appropriate for Redis.<\/p>\n<p>We&#8217;ll tackle the navbar in Part 2 and see how much of a difference <em><strong>that<\/strong><\/em> makes.<br \/>\n<\/div><div class=\"ezcol ezcol-one-half ezcol-last\">\n<div class=\"horttech-code\"><pre class=\"preserve-code-formatting\">$ .\/timeit\n0.178 seconds\n0.168 seconds\n0.162 seconds\n0.175 seconds\n0.175 seconds\n0.191 seconds\n0.174 seconds\n0.167 seconds\n0.169 seconds\n0.167 seconds\n0.179 seconds\n0.177 seconds\nAverage load time: .17350000000000000000\n\n<\/pre><\/div>\n<\/div><div class=\"ezcol-divider\"><\/div><\/p>\n","protected":false},"excerpt":{"rendered":"<div class=\"share_buttons_simple_use_buttons\" style=\"padding: 10px 0; display: inline-block\"><div class=\"tweet_button\" style=\"float: left; vertical-align: top\"><a href=\"https:\/\/twitter.com\/share\" class=\"twitter-share-button\" data-url=\"https:\/\/www.hort.net\/techblog\/2014\/10\/03\/upgrading-hort-net-to-redis\/\" data-text=\"Upgrading hort.net to Redis, part 1\" data-count=\"none\">Tweet<\/a><script type=\"text\/javascript\" src=\"https:\/\/platform.twitter.com\/widgets.js\"><\/script><\/div><div class=\"facebook_like_button\" style=\"float: left; vertical-align: top; margin-left: 10px; max-width: 255px\"><iframe src=\"https:\/\/www.facebook.com\/plugins\/like.php?href=https%3A%2F%2Fwww.hort.net%2Ftechblog%2F2014%2F10%2F03%2Fupgrading-hort-net-to-redis%2F&amp;layout=button_count&amp;show_faces=false&amp;width=450&amp;action=like&amp;colorscheme=light&amp;height=25\" scrolling=\"no\" frameborder=\"0\" style=\"border:none; overflow:hidden; width:450px; height:25px;\" allowTransparency=\"true\"><\/iframe><\/div><\/div><p>TweetWe\u2019ve been playing with Redis at hort.net over the past year. Its speed and simplicity make system performance better and programming easier \u2014 what used to be 30 lines of MySQL statements can be condensed into five Redis queries that shave 1\/10th of a second off page loading times. And that\u2019s just for one part [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[4,3,2],"class_list":["post-6","post","type-post","status-publish","format-standard","hentry","category-databases","tag-database","tag-mysql","tag-redis"],"_links":{"self":[{"href":"https:\/\/www.hort.net\/techblog\/wp-json\/wp\/v2\/posts\/6","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.hort.net\/techblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.hort.net\/techblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.hort.net\/techblog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.hort.net\/techblog\/wp-json\/wp\/v2\/comments?post=6"}],"version-history":[{"count":60,"href":"https:\/\/www.hort.net\/techblog\/wp-json\/wp\/v2\/posts\/6\/revisions"}],"predecessor-version":[{"id":160,"href":"https:\/\/www.hort.net\/techblog\/wp-json\/wp\/v2\/posts\/6\/revisions\/160"}],"wp:attachment":[{"href":"https:\/\/www.hort.net\/techblog\/wp-json\/wp\/v2\/media?parent=6"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.hort.net\/techblog\/wp-json\/wp\/v2\/categories?post=6"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.hort.net\/techblog\/wp-json\/wp\/v2\/tags?post=6"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}