<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>dhruv's space (Posts about machine-learning)</title><link>https://dhruvs.space/</link><description></description><atom:link href="https://dhruvs.space/categories/machine-learning.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2020 &lt;a href="mailto:dhruvt93@gmail.com"&gt;Dhruv Thakur&lt;/a&gt; </copyright><lastBuildDate>Fri, 14 Feb 2020 22:35:39 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Visualizing Optimisation Algorithms</title><link>https://dhruvs.space/posts/visualizing-optimisation-algorithms/</link><dc:creator>Dhruv Thakur</dc:creator><description>&lt;div&gt;&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;p&gt;The &lt;a href="https://www.coursera.org/learn/neural-networks-deep-learning"&gt;first&lt;/a&gt; and &lt;a href="https://www.coursera.org/learn/deep-neural-network/"&gt;second&lt;/a&gt; courses by &lt;a href="https://www.deeplearning.ai/"&gt;deeplearning.ai&lt;/a&gt; offer great insight into the workings of various optimisation algorithms used in machine learning. Specifically, they focus on Batch Gradient Descent, Mini-batch Gradient Descent (with and without momentum), and Adam optimisation. Since finishing the two courses, I've wanted to go deeper into the world of optimisation. This is probably the first step towards that.&lt;/p&gt;
&lt;p&gt;This notebook/post is an introductory-level analysis of how these optimisation approaches work. The intent is to see these algorithms in action visually, and hopefully understand how they differ from each other.&lt;/p&gt;
&lt;p&gt;The approach below is greatly inspired by &lt;a href="http://louistiao.me/notes/visualizing-and-animating-optimization-algorithms-with-matplotlib/"&gt;this&lt;/a&gt; post by &lt;a href="http://louistiao.me/"&gt;Louis Tiao&lt;/a&gt; on optimisation visualizations, and &lt;a href="http://jakevdp.github.io/blog/2012/08/18/matplotlib-animation-tutorial/"&gt;this&lt;/a&gt; tutorial on matplotlib animation by &lt;a href="http://jakevdp.github.io/pages/about.html"&gt;Jake VanderPlas&lt;/a&gt;.&lt;/p&gt;

&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;h3&gt;Table of Contents&lt;span class="tocSkip"&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;div class="toc"&gt;&lt;ul class="toc-item"&gt;&lt;li&gt;&lt;span&gt;&lt;a href="https://dhruvs.space/posts/visualizing-optimisation-algorithms/#Setup" data-toc-modified-id="Setup-1"&gt;Setup&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span&gt;&lt;a href="https://dhruvs.space/posts/visualizing-optimisation-algorithms/#Batch-Gradient-Descent" data-toc-modified-id="Batch-Gradient-Descent-2"&gt;Batch Gradient Descent&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span&gt;&lt;a href="https://dhruvs.space/posts/visualizing-optimisation-algorithms/#Mini-Batch-Gradient-Descent" data-toc-modified-id="Mini-Batch-Gradient-Descent-3"&gt;Mini-Batch Gradient Descent&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span&gt;&lt;a href="https://dhruvs.space/posts/visualizing-optimisation-algorithms/#Mini-Batch-Gradient-Descent-with-Momentum" data-toc-modified-id="Mini-Batch-Gradient-Descent-with-Momentum-4"&gt;Mini-Batch Gradient Descent with Momentum&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span&gt;&lt;a href="https://dhruvs.space/posts/visualizing-optimisation-algorithms/#Adam-optimisation" data-toc-modified-id="Adam-optimisation-5"&gt;Adam optimisation&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span&gt;&lt;a href="https://dhruvs.space/posts/visualizing-optimisation-algorithms/#References" data-toc-modified-id="References-6"&gt;References&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;hr&gt;

&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;h3 id="Setup"&gt;Setup&lt;a class="anchor-link" href="https://dhruvs.space/posts/visualizing-optimisation-algorithms/#Setup"&gt;¶&lt;/a&gt;&lt;/h3&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="cell border-box-sizing code_cell rendered"&gt;
&lt;div class="input"&gt;
&lt;div class="prompt input_prompt"&gt;In [1]:&lt;/div&gt;
&lt;div class="inner_cell"&gt;
    &lt;div class="input_area"&gt;
&lt;div class=" highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# imports&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;mpl_toolkits.mplot3d&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Axes3D&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.colors&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LogNorm&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;matplotlib&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="n"&gt;animation&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;IPython.display&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HTML&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;math&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;itertools&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="n"&gt;zip_longest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn.datasets&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="n"&gt;make_classification&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="cell border-box-sizing code_cell rendered"&gt;
&lt;div class="input"&gt;
&lt;div class="prompt input_prompt"&gt;In [58]:&lt;/div&gt;
&lt;div class="inner_cell"&gt;
    &lt;div class="input_area"&gt;
&lt;div class=" highlight hl-ipython3"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="k"&gt;matplotlib&lt;/span&gt; inline
&lt;/pre&gt;&lt;/div&gt;

    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="cell border-box-sizing text_cell rendered"&gt;&lt;div class="prompt input_prompt"&gt;
&lt;/div&gt;&lt;div class="inner_cell"&gt;
&lt;div class="text_cell_render border-box-sizing rendered_html"&gt;
&lt;p&gt;Alright. We need something to optimise. To begin with, let's try to find the minima of &lt;a href="https://en.wikipedia.org/wiki/Himmelblau%27s_function"&gt;Himmelblau's function&lt;/a&gt;, which is defined as:&lt;/p&gt;
$$f(x,y)=(x^{2}+y-11)^{2}+(x+y^{2}-7)^{2}$$
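&lt;p&gt;As a quick sanity check (a minimal sketch of my own, not part of the original notebook; the helper name &lt;code&gt;himmelblau&lt;/code&gt; is hypothetical), the function can be written directly in Python and evaluated at one of its four known minima, (3, 2), where it should return 0:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# minimal sketch: an assumed helper, not from the original notebook
def himmelblau(x, y):
    # f(x, y) = (x**2 + y - 11)**2 + (x + y**2 - 7)**2
    return (x ** 2 + y - 11) ** 2 + (x + y ** 2 - 7) ** 2

print(himmelblau(3.0, 2.0))  # (3, 2) is one of the four minima; prints 0.0
&lt;/code&gt;&lt;/pre&gt;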
&lt;p&gt;&lt;a href="https://dhruvs.space/posts/visualizing-optimisation-algorithms/"&gt;Read more…&lt;/a&gt; (16 min remaining to read)&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</description><category>adam-optimiser</category><category>gradient-descent</category><category>machine-learning</category><category>optimisation</category><guid>https://dhruvs.space/posts/visualizing-optimisation-algorithms/</guid><pubDate>Mon, 17 Sep 2018 18:17:21 GMT</pubDate></item><item><title>Summary Notes: Forward and Back Propagation</title><link>https://dhruvs.space/posts/understanding-forward-and-backpropagation/</link><dc:creator>Dhruv Thakur</dc:creator><description>&lt;div&gt;&lt;p&gt;I recently completed the &lt;a href="https://www.coursera.org/learn/neural-networks-deep-learning"&gt;first&lt;/a&gt; course offered by &lt;a href="https://www.deeplearning.ai/"&gt;deeplearning.ai&lt;/a&gt;, and found it incredibly educational. Going forward, I want to keep a summary of the stuff I learn (for my future reference) in the form of notes like this. This one is for forward and back-prop intuitions.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://dhruvs.space/posts/understanding-forward-and-backpropagation/"&gt;Read more…&lt;/a&gt; (3 min remaining to read)&lt;/p&gt;&lt;/div&gt;</description><category>deep-learning</category><category>machine-learning</category><category>neural-nets</category><category>summary-notes</category><guid>https://dhruvs.space/posts/understanding-forward-and-backpropagation/</guid><pubDate>Mon, 10 Sep 2018 10:55:00 GMT</pubDate></item></channel></rss>