<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0"><channel><atom:link rel="hub" href="http://tumblr.superfeedr.com/" xmlns:atom="http://www.w3.org/2005/Atom"/><description>CEO at Minjar. Previously CTO at Kuliza, GizaPage and worked for Trilogy.</description><title>Untitled Technopreneur</title><generator>Tumblr (3.0; @amnigos)</generator><link>http://amnigos.com/</link><item><title>justmigrate:

Hi,
I just moved my posts from Posterous! Do go...</title><description>&lt;img src="http://25.media.tumblr.com/00a404b338902015e06a786c391882f3/tumblr_mib6hourGt1s63aolo1_500.png"/&gt;&lt;br/&gt;&lt;br/&gt;&lt;p&gt;&lt;a href="http://justmigrate.tumblr.com/post/43217428437/hi-i-just-moved-my-posts-from-posterous-do-go" class="tumblr_blog"&gt;justmigrate&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;Hi,&lt;/p&gt;
&lt;p&gt;I just moved my posts from Posterous! Do go though my blog for all the new posts.&lt;/p&gt;
&lt;p&gt;Its easy to migrate try &lt;a href="http://justmigrate.com"&gt;JustMigrate&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://3crumbs.com/app"&gt;3Crumbs app - Are you the local thrifter we all have been looking for? &lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;</description><link>http://amnigos.com/post/44631345172</link><guid>http://amnigos.com/post/44631345172</guid><pubDate>Tue, 05 Mar 2013 21:15:49 +0530</pubDate></item><item><title>Hardware Heterogeneity in AWS EC2 and Impact on Instance Performance</title><description>&lt;p&gt;If you have been using AWS EC2 for long enough, you will have noticed that certain instances are slower than other instances of the same type (such as m1.large or m1.xlarge). Now an &lt;a href="https://www.usenix.org/system/files/conference/hotcloud12/hotcloud12-final40.pdf"&gt;interesting study presented at USENIX HotCloud '12&lt;/a&gt; confirms that &lt;strong&gt;the underlying hardware heterogeneity in AWS EC2 does indeed affect instance performance from a CPU, memory and disk perspective&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The study used UnixBench to measure CPU, Redis to measure memory, and Dbench to measure disk performance on instances of the same type in the US-EAST region across different availability zones. It confirms that you can gain a 30% to 60% improvement in system performance depending on the actual underlying hardware on which your instance is running.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pro and lazy tip&amp;#160;:&lt;/strong&gt; Launch instances in new availability zones of your AWS region to improve the odds of your instance landing on better hardware :)&lt;/p&gt;</description><link>http://amnigos.com/post/44631348134</link><guid>http://amnigos.com/post/44631348134</guid><pubDate>Thu, 25 Oct 2012 16:33:00 +0530</pubDate><category>JustMigrate</category><category>aws</category><category>cloud-performance</category><category>cloudcomputing</category></item><item><title>BarcampBangalore Talk : Scalable Load Testing using JMeter in AWS Cloud</title><description>&lt;p&gt;I did a presentation at the &lt;a href="http://barcampbangalore.org/bcb/"&gt;BarcampBangalore&lt;/a&gt; Techlash session on &amp;#8220;Scalable Load Testing using JMeter in AWS Cloud&amp;#8221;. The idea was to showcase how to leverage JMeter and Amazon EC2 to generate massive load tests at low cost without having to worry about the complexity of setup or configuration.&lt;/p&gt;
&lt;p&gt;I used the &lt;a href="https://github.com/amnigos/jmeter-ec2"&gt;jmeter-ec2&lt;/a&gt; script as the base for this talk, and you can find the presentation below.&lt;/p&gt;

&lt;p&gt;&lt;iframe scrolling="no" margin src="http://www.slideshare.net/slideshow/embed_code/14069147" frameborder="0"&gt; &lt;/iframe&gt;&lt;/p&gt;
&lt;div style="margin-bottom: 5px;"&gt;&lt;strong&gt; &lt;a href="http://www.slideshare.net/amnigos/scalable-load-testing-using-jmeter-in-cloud" title="Scalable load testing using jmeter in cloud" target="_blank"&gt;Scalable load testing using jmeter in cloud&lt;/a&gt; &lt;/strong&gt; from &lt;strong&gt;&lt;a href="http://www.slideshare.net/amnigos" target="_blank"&gt;Vijay Rayapati&lt;/a&gt;&lt;/strong&gt;&lt;/div&gt;
&lt;p&gt;I would love to hear your feedback. Do check out our new web-based load testing service called &lt;a href="http://load.minjar.com/" title="Minjar Load Testing"&gt;Minjar CloudLoad&lt;/a&gt;, where you can run tests with up to 1000 concurrent requests for free.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;</description><link>http://amnigos.com/post/44631348626</link><guid>http://amnigos.com/post/44631348626</guid><pubDate>Mon, 27 Aug 2012 15:48:00 +0530</pubDate><category>JustMigrate</category><category>aws</category><category>JMeter</category><category>LoadTesting</category></item><item><title>AWS India Summit - 2012</title><description>&lt;p&gt;Amazon Web Services is organizing their second cloud summit in India across Bangalore, Chennai and Mumbai. You can find more details on &lt;a href="http://aws.amazon.com/apac/awssummit-in/?trk=pa_ku"&gt;AWS India Summit&lt;/a&gt; site.&lt;/p&gt;
&lt;p&gt;The AWS summit focuses on bringing together partners, existing customers and prospects to discuss and share ideas around cloud computing, including best practices around AWS infrastructure. It&amp;#8217;s a must-attend event for architects and startups planning to use cloud infrastructure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Top 10 reasons to attend, as listed on the AWS Summit page:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;&lt;span style="line-height: 14px; font-size: small;"&gt;Hear the opening keynote by Amazon.com CTO, &lt;/span&gt;&lt;a href="http://aws.amazon.com/apac/awssummit-in/?trk=webpage#werner-vogels" rel="ibox" style="color: #996633; text-decoration: none;"&gt;Dr. Werner Vogels&lt;/a&gt;&lt;span style="line-height: 14px; font-size: small;"&gt;, on the &lt;/span&gt;&lt;strong style="line-height: 14px; font-size: small;"&gt;future of the AWS Cloud&lt;/strong&gt;&lt;span style="line-height: 14px; font-size: small;"&gt; and learn about the &lt;/span&gt;&lt;strong style="line-height: 14px; font-size: small;"&gt;7 major transformations of cloud computing&lt;/strong&gt;&lt;span style="line-height: 14px; font-size: small;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;strong style="font-size: small; line-height: 14px;"&gt;Ask questions directly to our customer panel&lt;/strong&gt;&lt;span style="font-size: small; line-height: 14px;"&gt; about how they leverage AWS in their own line of business applications.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: small; line-height: 14px;"&gt;Learn about &lt;/span&gt;&lt;strong style="font-size: small; line-height: 14px;"&gt;The Total Cost of (Non) Ownership in the Cloud and cost savings&lt;/strong&gt;&lt;span style="font-size: small; line-height: 14px;"&gt; using AWS.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: small; line-height: 14px;"&gt;Discover the &lt;/span&gt;&lt;strong style="font-size: small; line-height: 14px;"&gt;latest services and features&lt;/strong&gt;&lt;span style="font-size: small; line-height: 14px;"&gt; in the AWS Cloud and learn how to put them to use in your business applications.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: small; line-height: 14px;"&gt;Deep dive into common solutions and workloads in the AWS Cloud: &lt;/span&gt;&lt;strong style="font-size: small; line-height: 14px;"&gt;Enterprise Applications, Content Delivery, Disaster Recovery, Big Data&lt;/strong&gt;&lt;span style="font-size: small; line-height: 14px;"&gt;, and more.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: small; line-height: 14px;"&gt;Gain understanding of AWS best practices for &lt;/span&gt;&lt;strong style="font-size: small; line-height: 14px;"&gt;developing, architecting, and securing applications&lt;/strong&gt;&lt;span style="font-size: small; line-height: 14px;"&gt; in the Cloud.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: small; line-height: 14px;"&gt;Hear how &lt;/span&gt;&lt;strong style="font-size: small; line-height: 14px;"&gt;AWS customers&lt;/strong&gt;&lt;span style="font-size: small; line-height: 14px;"&gt; have successfully built and migrated their applications to the AWS Cloud.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: small; line-height: 14px;"&gt;Learn from first-hand experiences about AWS empowering &lt;/span&gt;&lt;strong style="font-size: small; line-height: 14px;"&gt;Agile development&lt;/strong&gt;&lt;span style="font-size: small; line-height: 14px;"&gt; for both startups and greenfield projects.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: small; line-height: 14px;"&gt;Explore best practices for running Microsoft applications on AWS securely and effectively.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: small; line-height: 14px;"&gt;Meet &lt;/span&gt;&lt;strong style="font-size: small; line-height: 14px;"&gt;AWS partners&lt;/strong&gt;&lt;span style="font-size: small; line-height: 14px;"&gt; who offer consulting and technology solutions to help get you started in the Cloud&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;span style="font-size: small;"&gt;Hope to see you at the Bangalore event on 4th October.&lt;/span&gt;&lt;/p&gt;</description><link>http://amnigos.com/post/44631349280</link><guid>http://amnigos.com/post/44631349280</guid><pubDate>Sun, 19 Aug 2012 18:58:00 +0530</pubDate><category>JustMigrate</category><category>aws</category><category>awsindiasummit</category><category>cloudcomputing</category></item><item><title>Top 10 Things To Know About Google Compute Engine</title><description>&lt;p&gt;Google took its first significant step into the Infrastructure-as-a-Service market by launching &lt;a href="http://cloud.google.com/products/compute-engine.html"&gt;Compute Engine&lt;/a&gt;, which lets customers run on-demand virtual machines on its global network of data centers. For a while, it has had a good presence in the &lt;a href="http://cloud.google.com/"&gt;Platform Cloud&lt;/a&gt; with GAE, Cloud Storage, BigQuery, and the Prediction and Translation APIs, but the primary focus there was to make it easier for developers to build new applications. With Compute Engine, it can now target developers who want to port their existing applications to Google Cloud.&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: medium;"&gt;&lt;strong&gt;Here are the top 10 things that you should know about Google Compute Engine&amp;#160;:&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Pricing&lt;/strong&gt; is cheaper than Amazon EC2 or the compute services of other public cloud platforms. This might not be a significant advantage, as Amazon is known to bring down its prices almost every quarter.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Guest OS Support &lt;/strong&gt;- Currently it supports only CentOS and Ubuntu, and by default it starts your instances with the Ubuntu 12.04 LTS server image.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Google Compute Engine persistent disk&lt;/strong&gt; can be attached to more than one instance in read-only mode. This helps use cases where you need to share a docbase or configuration across instances without resorting to rsync or NFS.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Predictable Performance&lt;/strong&gt; - Google makes strong claims of reliable and highly predictable performance from its instances, unlike the variable performance seen on most public cloud platforms under large-scale workloads or heavy consumption by another tenant (mostly due to the shared nature of the underlying physical resources). Some of the early &lt;a href="http://youtu.be/LCjSJ778tGU"&gt;customers are raving about reliable performance&lt;/a&gt; from Google Compute Engine instances.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Data Security&lt;/strong&gt; - Another significant advantage: Google Compute Engine encrypts the data stored on disks (both persistent and ephemeral), which takes care of data at rest. For persistent disks it also encrypts data on the host before transmitting it to network storage, which takes care of data in transit. This would ease data compliance and security constraints for enterprise applications.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. Networking - &lt;/strong&gt;End users get a very high level of control in terms of creating and managing their instances&amp;#8217; networks and firewalls. One interesting aspect is that you can have a private network that connects all your instances across different Google Cloud regions over Google&amp;#8217;s high-performance global network, without going over the public internet.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;7. HPC Focus - &lt;/strong&gt;Google is currently focusing Compute Engine on big data, batch processing and HPC workloads, for which it can offer very large-scale computing resources. Given the predictable performance, high memory per core in every instance type, scalable cloud storage and data security, it would be enticing for most large-scale computations or workloads.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;8. Maintenance Windows - &lt;/strong&gt;During the limited preview for developers, Google will have pre-defined, announced maintenance windows in its data centers. During a window your instances will be terminated and your persistent disks won&amp;#8217;t be available for use. Google &lt;a href="https://developers.google.com/compute/docs/robustsystems"&gt;encourages distributed deployments&lt;/a&gt; to avoid any issues.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;9. No IPv6 Support - &lt;/strong&gt;It doesn&amp;#8217;t support IPv6 yet, but that should be added in the near future. Also, if you need a static IP address for your instances, you need to request it via email; I guess that will be fixed soon.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;10. Limited Preview - &lt;/strong&gt;Google Compute Engine is in limited preview and you can &lt;a href="https://gce-signup.appspot.com/" style="font-size: 12px; line-height: 15px;"&gt;place your request here&lt;/a&gt; for access.&lt;/p&gt;

&lt;p class="p1"&gt;With Amazon, Google and Microsoft all focusing strongly on the public cloud market and each trying to out-innovate the others, developers, startups and enterprises stand to gain the real advantages of on-demand computing. It&amp;#8217;s more than a pricing/platform war; it is accelerated innovation!&lt;/p&gt;

</description><link>http://amnigos.com/post/44631349993</link><guid>http://amnigos.com/post/44631349993</guid><pubDate>Fri, 29 Jun 2012 11:45:00 +0530</pubDate><category>JustMigrate</category><category>Cloud Engine</category><category>cloudcomputing</category><category>Google Cloud</category></item><item><title>Remote JMX monitoring of java application in AWS Cloud</title><description>&lt;p&gt;Recently, one of our cloud engineers was enabling JMX monitoring for a Java application deployed using AWS Elastic Beanstalk. While he could connect to it locally using JConsole, connecting with JConsole from a remote machine failed with a connection refused exception, even though Telnet could reach the JMX port from that machine.&lt;/p&gt;&lt;p&gt;Standard configuration exported via CATALINA_OPTS of the Beanstalk AMI to enable remote JMX monitoring:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8090 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I troubleshot the problem by enabling the debug option of JConsole on the client side and found that the RMI stub on the server was listening on a different port, which was causing the connection failure.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;jconsole -J-Djava.util.logging.config.file=&amp;lt;path-to-log-properties-file&amp;gt;&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I created the following file with the required log properties for JConsole:&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: x-small;"&gt;Logging.properties&lt;/span&gt;&lt;br/&gt;&lt;span style="font-size: x-small;"&gt;handlers = java.util.logging.ConsoleHandler&lt;/span&gt;&lt;br/&gt;&lt;span style="font-size: x-small;"&gt;.level = INFO&lt;/span&gt;&lt;br/&gt;&lt;span style="font-size: x-small;"&gt;java.util.logging.ConsoleHandler.level = FINEST&lt;/span&gt;&lt;br/&gt;&lt;span style="font-size: x-small;"&gt;java.util.logging.ConsoleHandler.formatter = \&lt;/span&gt;&lt;br/&gt;&lt;span style="font-size: x-small;"&gt;java.util.logging.SimpleFormatter&lt;/span&gt;&lt;span style="font-size: x-small;"&gt;&lt;/span&gt;&lt;br/&gt;&lt;span style="font-size: x-small;"&gt;javax.management.level = FINEST&lt;/span&gt;&lt;br/&gt;&lt;span style="font-size: x-small;"&gt;javax.management.remote.level = FINEST&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The debug log showed that JConsole was unable to connect to the RMI stub listening on a different port. According to the &lt;a href="https://blogs.oracle.com/jmxetc/entry/connecting_through_firewall_using_jmx"&gt;JMX monitoring documentation&lt;/a&gt;, JMX uses two ports: one for the RMI registry (which was configured using the above settings) and one where the RMI connection objects are exported (this one is chosen at random unless we set it through a custom JMXServiceURL).&lt;/p&gt;&lt;p&gt;For us the challenge was that this would mean opening all ports on the EC2 machines, so we set up a dedicated machine with JConsole installed and opened the ports from our Beanstalk group to that specific machine only.&lt;/p&gt;</description><link>http://amnigos.com/post/44631350531</link><guid>http://amnigos.com/post/44631350531</guid><pubDate>Wed, 27 Jun 2012 17:19:00 +0530</pubDate><category>JustMigrate</category><category>cloudcomputing</category><category>JMX</category></item><item><title>Oracle Database Licensing In AWS Cloud</title><description>&lt;p&gt;I was evaluating running Oracle Enterprise Edition on Amazon EC2 for a data warehouse application and stumbled upon the licensing policy, which is quite different from the one for physical hardware environments. I have simplified it here for quick understanding.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Following are the details for the Standard and Enterprise Editions&amp;#160;:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Oracle Standard Edition&lt;/strong&gt; - A single EC2 instance of up to 4 virtual cores is counted as 1 processor license. It&amp;#8217;s also important to note that if you are running Oracle SE on 2 EC2 machines, each with 1 core, then you would need 2 processor licenses.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2. Oracle Enterprise Edition &lt;/strong&gt;- A single EC2 instance of 8 virtual cores (a platform with a core processor licensing factor of 0.5) would require 8 * 0.5 = 4 processor licenses. So if we have 4 virtual cores then we would need 2 processor licenses.&lt;/p&gt;
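&lt;p&gt;The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the simplified rules in this post, not of Oracle&amp;#8217;s full policy: the function names are made up for the example, and rounding a fractional Enterprise Edition result up is an assumption.&lt;/p&gt;
&lt;pre&gt;
```python
import math

# Constants taken from the simplified rules in this post.
SE_MAX_VCORES = 4      # SE: an instance of up to 4 virtual cores = 1 license
EE_CORE_FACTOR = 0.5   # EE: core processor licensing factor


def standard_edition_licenses(instance_vcores):
    """SE (hypothetical helper): one processor license per EC2 instance."""
    for vcores in instance_vcores:
        if vcores not in range(1, SE_MAX_VCORES + 1):
            # Larger instances are outside what this post describes.
            raise ValueError("only instances of 1 to 4 virtual cores are covered")
    return len(instance_vcores)


def enterprise_edition_licenses(vcores):
    """EE (hypothetical helper): virtual cores times the core factor.

    Rounding up is an assumption for fractional results.
    """
    return math.ceil(vcores * EE_CORE_FACTOR)


print(standard_edition_licenses([4]))      # one 4-core instance: prints 1
print(standard_edition_licenses([1, 1]))   # two 1-core instances: prints 2
print(enterprise_edition_licenses(8))      # prints 4
print(enterprise_edition_licenses(4))      # prints 2
```
&lt;/pre&gt;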
&lt;p&gt;You can find details about this from &lt;a href="http://www.oracle.com/us/corporate/pricing/cloud-licensing-070579.pdf"&gt;Oracle Cloud Documentation&lt;/a&gt;.&lt;/p&gt;

</description><link>http://amnigos.com/post/44631351019</link><guid>http://amnigos.com/post/44631351019</guid><pubDate>Fri, 22 Jun 2012 14:10:00 +0530</pubDate><category>JustMigrate</category><category>aws</category><category>cloudcomputing</category><category>Licensing</category><category>Oracle</category></item><item><title>Really FAST : High Performance JSON Parser for JAVA</title><description>&lt;p&gt;I was evaluating JSON parsers with a Java wrapper for use in one of our Big Data projects. While there are hundreds of them, our first criteria were high performance (serialization and deserialization) and robustness (the ability to deal and scale well with large payloads). In our use case, even 100ms of difference in parsing or creating JSON would mean good gains for us, as the parser will be operating on hundreds of millions of input records in HDFS through Map Reduce.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Here is what I stumbled upon:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;1. &lt;a href="https://github.com/eishay/jvm-serializers/wiki"&gt;Test cases and results&lt;/a&gt; for most of the Java JSON wrappers.&lt;/p&gt;
&lt;p&gt;2. &lt;a href="http://martinadamek.com/2011/02/04/json-parsers-performance-on-android-with-warmup-and-multiple-iterations/"&gt;JSON performance&lt;/a&gt; on Android with warmup.&lt;/p&gt;
&lt;p&gt;3. Google-gson &lt;a href="https://sites.google.com/site/gson/gson-performance"&gt;performance&lt;/a&gt; page.&lt;/p&gt;
&lt;p&gt;After looking at the above benchmarking results and some further digging, we chose to go with&lt;strong&gt; &lt;a href="http://jackson.codehaus.org/"&gt;Jackson as it proves to be FAST.&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;P.S&amp;#160;: we like json-simple and google-gson as well, but speed matters to us.&lt;/p&gt;</description><link>http://amnigos.com/post/44631351508</link><guid>http://amnigos.com/post/44631351508</guid><pubDate>Wed, 11 Apr 2012 19:06:00 +0530</pubDate><category>JustMigrate</category><category>Hadoop</category><category>json</category><category>Performance Tuning</category></item><item><title>Hadoop Optimization : Dealing with small files problem</title><description>&lt;p&gt;Hadoop is not good at dealing with tons of small files; it is much better at handling large files. Too many small files increase the number of mappers and the job coordination effort (task scheduling), leave less work for each map task, and increase the overall processing time.&lt;/p&gt;
&lt;p&gt;Input to a Hadoop MapReduce process is abstracted by the InputFormat in use, and FileInputFormat is the default implementation that deals with files in HDFS. With FileInputFormat, each file is split into one or more InputSplits, each typically bounded above by the block size. This means the number of input splits is bounded below by the number of input files. This is not an ideal environment for a MapReduce process that is dealing with a large number of small files, because the overhead of coordinating distributed processes is far greater than when there is a relatively small number of large files.&lt;/p&gt;
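&lt;p&gt;To make the split arithmetic concrete, here is a rough sketch in Python. It is not Hadoop code: the 64 MB block size and the function names are assumptions for illustration, and the packing loop is a simplification that combines whole files rather than blocks.&lt;/p&gt;
&lt;pre&gt;
```python
import math

BLOCK_SIZE_MB = 64  # assumed HDFS block size, for illustration only


def splits_per_file(file_sizes_mb, block_size_mb=BLOCK_SIZE_MB):
    """Per-file splitting: every file yields at least one split,
    and a file larger than a block yields one split per block."""
    return sum(max(1, math.ceil(size / block_size_mb)) for size in file_sizes_mb)


def splits_when_combined(file_sizes_mb, max_split_mb=BLOCK_SIZE_MB):
    """Simplified combining: pack whole files into one split until
    adding the next file would exceed max_split_mb."""
    splits = 0
    current = 0
    for size in file_sizes_mb:
        if current > 0 and current + size > max_split_mb:
            splits += 1      # close the split that is already full
            current = 0
        current += size
    return splits + (1 if current > 0 else 0)


small_files = [1] * 1000  # one thousand files of 1 MB each
print(splits_per_file(small_files))       # prints 1000, one map task per file
print(splits_when_combined(small_files))  # prints 16, 64 files per split
```
&lt;/pre&gt;
&lt;p&gt;In this toy example the same 1000 MB of input goes from 1000 map tasks down to 16, which is the effect the next paragraph is after.&lt;/p&gt;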
&lt;p&gt;You can overcome this problem by extending &lt;a href="http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/CombineFileInputFormat.html"&gt;&lt;strong&gt;CombineFileInputFormat&lt;/strong&gt;&lt;/a&gt;, implementing a &lt;a href="http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapred/RecordReader.html"&gt;&lt;strong&gt;RecordReader&lt;/strong&gt;&lt;/a&gt;, and then controlling the number of maps by passing a mapred.max.split.size value to your Custom Jar command.&lt;/p&gt;
&lt;p&gt;Once you have the right number of mappers for the capacity of your cluster, you can improve the overall efficiency of each map task by providing the right configuration (io sort buffer and JVM heap size). We have noticed &lt;strong&gt;significant performance improvements (40 to 50% and more)&lt;/strong&gt; for some customers when using CombineFileInputFormat over a large number of small files.&lt;/p&gt;
&lt;p&gt;You need to be aware that processing a large amount of data per map is bad in case of task failures, as recovery takes more time and hurts overall processing latency. So you need to tune the data split for each map depending on your processing complexity, available resources (especially memory) and intermediate output size.&lt;/p&gt;</description><link>http://amnigos.com/post/44631352032</link><guid>http://amnigos.com/post/44631352032</guid><pubDate>Thu, 01 Mar 2012 12:20:00 +0530</pubDate><category>JustMigrate</category><category>Hadoop</category><category>MapReduce</category><category>Performance Tuning</category></item><item><title>How To Guide : Tata Docomo 3G on Ubuntu 11.10</title><description>&lt;p&gt;It&amp;#8217;s pretty straightforward; just follow the steps below.&lt;/p&gt;
&lt;p&gt;1. Connect your 3G stick and boot up Ubuntu.&lt;/p&gt;
&lt;p&gt;2. Select network connections and click on the New Mobile Broadband Connection.&lt;/p&gt;
&lt;p&gt;3. Now select continue in the dialog box and select India as country &amp;amp; then click continue.&lt;/p&gt;
&lt;p&gt;4. Now it will show a list of service providers. DON&amp;#8217;T select &amp;#8220;Tata Docomo&amp;#8221;, as that is for Photon+ and not for 3G. Instead, select the &amp;#8220;I don&amp;#8217;t know my provider&amp;#8221; option, enter &amp;#8220;TATA DOCOMO UMTS&amp;#8221; and click continue.&lt;/p&gt;
&lt;p&gt;5. In the billing dialog, select the &amp;#8220;My plan is not listed&amp;#8221; option and enter &amp;#8220;tatadocomo3g&amp;#8221; as the APN, then click confirm and save your settings.&lt;/p&gt;
&lt;p&gt;6. Now under Network Connections you should see a &amp;#8220;TATA DOCOMO UMTS connection&amp;#8221; option that you can click on. If it doesn&amp;#8217;t connect to the internet, just unplug and re-plug your 3G stick.&lt;/p&gt;
&lt;p&gt;That&amp;#8217;s it, you can use your Tata Docomo 3G stick with Ubuntu 11.10 :)&lt;/p&gt;</description><link>http://amnigos.com/post/44631352562</link><guid>http://amnigos.com/post/44631352562</guid><pubDate>Fri, 17 Feb 2012 23:41:00 +0530</pubDate><category>JustMigrate</category><category>littlehacks</category><category>tatadocomo</category><category>ubuntu</category></item><item><title>Big Data and Hadoop in Cloud - Leveraging Amazon EMR</title><description>&lt;p&gt;I did a talk last week at &lt;a href="http://barcampbangalore.org/bcb/bcb11/elastic-map-reduce-running-hadoop-and-big-data-in-cloud"&gt;Barcamp Bangalore&lt;/a&gt; on&lt;strong&gt; &amp;#8220;Big Data and Hadoop in Cloud - Leveraging Amazon EMR&amp;#8221;&lt;/strong&gt;. The focus was to help the audience understand Big Data and how to leverage frameworks like Hadoop to build context and derive insights. &lt;a href="http://en.wikipedia.org/wiki/Big_data"&gt;Big data&lt;/a&gt; is becoming a common use case, and we need distributed systems that can store growing data sets and take advantage of parallel processing to analyze them.&lt;/p&gt;
&lt;p&gt;I spoke about Hadoop, Map Reduce in general and how to run  Hadoop Map Reduce jobs using Amazon EMR service. Also shared some  insights from managing hyper scale production Hadoop clusters and tuning  for performance in general – Think 68400&amp;#160;GB RAM, 26000 CPUs and 1700000  GB Disks &lt;img class="wp-smiley" src="http://barcampbangalore.org/bcb/wp-includes/images/smilies/icon_smile.gif" alt=":)"/&gt;&lt;/p&gt;
&lt;div style=""&gt;&lt;strong style="display: block; margin: 12px 0 4px;"&gt;&lt;a href="http://www.slideshare.net/amnigos/big-data-and-hadoop-in-cloud-leveraging-amazon-emr" title="Big Data and Hadoop in Cloud - Leveraging Amazon EMR" target="_blank"&gt;Big Data and Hadoop in Cloud - Leveraging Amazon EMR&lt;/a&gt;&lt;/strong&gt; &lt;iframe scrolling="no" margin src="http://www.slideshare.net/slideshow/embed_code/11525173" frameborder="0"&gt;&lt;/iframe&gt;
&lt;div style="padding: 5px 0 12px;"&gt;View more &lt;a href="http://www.slideshare.net/thecroaker/death-by-powerpoint" target="_blank"&gt;PowerPoint&lt;/a&gt; from &lt;a href="http://www.slideshare.net/amnigos" target="_blank"&gt;Vijay Rayapati&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Drop me a note if you have any specific comments. Would love to hear your feedback!&lt;/p&gt;</description><link>http://amnigos.com/post/44631353056</link><guid>http://amnigos.com/post/44631353056</guid><pubDate>Mon, 13 Feb 2012 21:20:00 +0530</pubDate><category>JustMigrate</category><category>Amazon EMR</category><category>cloudcomputing</category><category>Hadoop</category><category>MapReduce</category><category>Performance Tuning</category></item><item><title>How To Guide - Hadoop MapReduce Debugging in Local Setup</title><description>&lt;p&gt;One of the important steps after you have &lt;a href="http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html"&gt;installed Hadoop successfully&lt;/a&gt; and played with the samples is to configure your development environment and figure out how to debug your Map Reduce programs written in Java.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hadoop can be installed in the local environment in 3 different modes&amp;#160;:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;1. Local Mode&lt;/p&gt;
&lt;p&gt;2. Pseudo Distributed Mode&lt;/p&gt;
&lt;p&gt;3. Fully Distributed Mode (Cluster)&lt;/p&gt;
&lt;p&gt;Typically you will be running your local Hadoop setup in Pseudo Distributed Mode to leverage HDFS and Map Reduce (MR). However, you cannot easily debug MR programs in this mode, as each Map/Reduce task runs in a separate JVM process. So you need to switch back to &lt;strong&gt;Local mode&lt;/strong&gt;, where your MR programs run in a single JVM process.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Configure Hadoop for Debugging&amp;#160;:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Run hadoop in local mode for debugging so mapper and reducer tasks run in a single JVM instead of separate JVMs. Below steps help you do it.&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;Configure HADOOP_OPTS to enable debugging, so that when you run your Hadoop job it waits for the debugger to connect.&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;(export HADOOP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8008")&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;Change the &lt;strong&gt;fs.default.name&lt;/strong&gt; value in core-site.xml from hdfs:// to &lt;strong&gt;file:///&lt;/strong&gt;. You won&amp;#8217;t be using HDFS in local mode.&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;Set the &lt;strong&gt;mapred.job.tracker&lt;/strong&gt; value in mapred-site.xml to &lt;strong&gt;local&lt;/strong&gt;. This instructs Hadoop to run MR tasks in a single JVM.&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;Create a remote debug configuration in Eclipse and set the port to 8008 - typical stuff.&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;Run your hadoop job (it will be waiting for the debugger to connect) and then launch Eclipse in debug mode.&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;Also use your favorite profiler to understand code level hotspots&lt;/li&gt;
&lt;/ul&gt;
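&lt;p&gt;The two configuration overrides from the list above can be generated with a small Python sketch. The helper name site_config is made up for this example, and the script only prints the XML; where your core-site.xml and mapred-site.xml live depends on your installation.&lt;/p&gt;
&lt;pre&gt;
```python
import xml.etree.ElementTree as ET


def site_config(properties):
    """Build a Hadoop *-site.xml document from a dict of
    property names to values (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Local-mode overrides described in this post.
core_site = site_config({"fs.default.name": "file:///"})
mapred_site = site_config({"mapred.job.tracker": "local"})

print(core_site)    # content for core-site.xml
print(mapred_site)  # content for mapred-site.xml
```
&lt;/pre&gt;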
&lt;p&gt;How do you debug your MR programs?&lt;/p&gt;</description><link>http://amnigos.com/post/44631353593</link><guid>http://amnigos.com/post/44631353593</guid><pubDate>Fri, 10 Feb 2012 12:20:00 +0530</pubDate><category>JustMigrate</category><category>debugging</category><category>Hadoop</category><category>MapReduce</category></item><item><title>Amazon DynamoDB : Yet Another NoSQL but Powerful in Cloud</title><description>&lt;p&gt;I have been using NoSQL databases in the Amazon Cloud, and one of the issues you will run into is variable IO as your datastore grows exponentially. While I am really excited about &lt;a href="http://aws.amazon.com/dynamodb/"&gt;&lt;strong&gt;DynamoDB as a fully managed NoSQL service&lt;/strong&gt;&lt;/a&gt;, what sets it apart from the others, or from running your own NoSQL cluster, is &lt;strong&gt;Performance&lt;/strong&gt;. Yes, SSD-backed storage is a real killer feature for controlling disk IO, as is automatic partitioning for &lt;a href="http://allthingsdistributed.com/2012/01/amazon-dynamodb.html"&gt;predictable performance&lt;/a&gt; of your read/write queries. Also, the free tier makes it easy to explore and benchmark with your own data. Have you tried it yet?&lt;/p&gt;

&lt;p&gt;&lt;iframe src="http://www.youtube.com/embed/oz-7wJJ9HZ0" frameborder="0"&gt;&lt;/iframe&gt;&lt;/p&gt;
&lt;p&gt;P.S&amp;#160;: AWS is having a &lt;a href="http://event.on24.com/r.htm?e=398929&amp;amp;s=1&amp;amp;k=0CBB1511BC82E9E3F1C0FF54F4360194&amp;amp;partnerref=LP_DynDB"&gt;free webinar on 15th Feb on DynamoDB&lt;/a&gt; - do register.&lt;/p&gt;</description><link>http://amnigos.com/post/44631354131</link><guid>http://amnigos.com/post/44631354131</guid><pubDate>Sat, 04 Feb 2012 13:03:00 +0530</pubDate><category>JustMigrate</category><category>aws</category><category>cloud</category><category>cloud-performance</category><category>DynaoDB</category></item><item><title>Amazon S3 Object Expiration : Leveraging it for Cloud Backups</title><description>&lt;p&gt;One of the issues with storing large amounts of backup data in Amazon S3 is having to write custom scripts to delete the data after a certain age. It also becomes painful to manage and maintain these pruning tasks when you have large-scale heterogeneous data for hundreds of applications.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;object expiration&lt;/strong&gt; feature in Amazon S3 is very useful for this task and makes it easier for developers and users. You can learn more &lt;a href="http://docs.amazonwebservices.com/AmazonS3/latest/dev/UsingObjects.html"&gt;about it here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As mentioned on AWS website&amp;#160;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You can define Object Expiration rules for a set of objects in your  bucket. Each expiration rule allows you to specify a prefix and an  expiration period in days. The prefix field (e.g. “logs/”) identifies  the object(s) subject to the expiration rule, and the expiration period  specifies the number of days from creation date (i.e. age) after which  object(s) should be removed. You may create multiple expiration rules  for different prefixes. After an Object Expiration rule is added, the  rule is applied to objects with the matching prefix that already exist  in the bucket as well as new objects added to the bucket. Once the  objects are past their expiration date, they will be queued for  deletion. You will not be charged for storage for objects on or after  their expiration date. Amazon S3 doesn’t charge you for using Object  Expiration. You can use Object Expiration rules on objects stored in  both Standard and Reduced Redundancy storage. Using Object Expiration  rules to schedule periodic removal of objects eliminates the need to  build processes to identify objects for deletion and submit delete  requests to Amazon S3.&lt;/p&gt;
&lt;/blockquote&gt;</description><link>http://amnigos.com/post/44631354646</link><guid>http://amnigos.com/post/44631354646</guid><pubDate>Wed, 04 Jan 2012 15:11:00 +0530</pubDate><category>JustMigrate</category><category>aws</category><category>cloud</category></item><item><title>Performance Tuning : Why STRACE Is Your Best Friend?</title><description>&lt;p&gt;I have been working with many customers over the last 6 months on fixing performance issues across different stacks in production deployments. One of the tools that always comes to the rescue is &lt;a href="http://linux.die.net/man/1/strace"&gt;&lt;strong&gt;strace&lt;/strong&gt;&lt;/a&gt; - a process diagnostic tool that traces system calls. You can install it using your favorite package manager if it&amp;#8217;s not there already.&lt;/p&gt;
&lt;p&gt;Most of the time, tuning performance on a live production machine (if you have some decent scale) is like dealing with a patient in an emergency ward - you have to act fast and fix things quickly. During any troubleshooting there will be many components in the system or application, and you need time to understand the impact of each before getting into &amp;#8220;let&amp;#8217;s fix this&amp;#8221; mode.&lt;/p&gt;
&lt;p&gt;I generally start with the first or last component in the chain (typically a web/app or db server) and try to identify what it is doing with the strace utility. It has helped me many times to identify whether the issue was in the apache/nginx/php-fpm/uwsgi/java processes themselves, a process blocking on a resource like the DB, or the application making a certain system call too many times due to bad code (imagine iterating over 1000&amp;#8217;s of records and accessing the timezone locale every time instead of caching it).&lt;/p&gt;
&lt;p&gt;The simplest way to get diagnostic information is to attach strace to your busy process (high cpu or memory) and just watch the system calls - it is immensely helpful to see the execution trace of the process.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Attach strace to a process for diagnostics&amp;#160;:&lt;/strong&gt; $ strace -p &amp;lt;processid&amp;gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;To see the open, read, close and connect calls of a process&amp;#160;:&lt;/strong&gt; $ strace -e trace=open,read,close,connect -p &amp;lt;pid&amp;gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Capture strace output for a process&amp;#160;:&lt;/strong&gt; $ strace -p &amp;lt;pid&amp;gt; -o /file/path/debug.php-fpm.txt&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What is taking time?&lt;/strong&gt;&amp;#160;: $ strace -c -p &amp;lt;pid&amp;gt;&lt;/p&gt;
&lt;p&gt;You can also use options like -s to raise the maximum printed string size above the default of 32 characters. It is one of the most powerful tools for troubleshooting in Linux environments.&lt;/p&gt;
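&lt;p&gt;As a minimal sketch of how the commands above can be combined, the helper below collects a syscall summary for a fixed time window and saves it to a file. The function name strace_summary, the 10 second default and the /tmp output path are illustrative assumptions, not part of strace. It also assumes that strace and the coreutils timeout command are installed and that you are allowed to trace the target process.&lt;/p&gt;

```shell
# Hypothetical helper (not part of strace): summarise the syscalls of a running
# process for a fixed time window and save the report to a file.
strace_summary() {
  pid="$1"                                  # process id to attach to
  seconds="${2:-10}"                        # how long to trace (assumed default: 10s)
  out="${3:-/tmp/strace-summary.$pid.txt}"  # report path (assumed default)

  # -c counts calls, errors and time per syscall; -f follows child processes;
  # -o writes the report to a file instead of stderr.
  # timeout sends SIGINT after the window, so strace detaches and writes its summary.
  timeout -s INT "$seconds" strace -c -f -p "$pid" -o "$out"

  # Print the report path so the caller can open it.
  echo "$out"
}
```

&lt;p&gt;For example, strace_summary 1234 30 would trace the (hypothetical) process 1234 for 30 seconds and print the path of the saved report.&lt;/p&gt;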
&lt;p&gt;Happy debugging and good luck with tuning systems - the best job at times :)&lt;/p&gt;</description><link>http://amnigos.com/post/44631355206</link><guid>http://amnigos.com/post/44631355206</guid><pubDate>Wed, 14 Dec 2011 15:12:00 +0530</pubDate><category>JustMigrate</category><category>debugging</category><category>linux</category><category>magento</category><category>Performance Tuning</category><category>strace</category></item><item><title>Startups - In the END, Only One Thing Matters For The...</title><description>&lt;img src="http://25.media.tumblr.com/4d0051fe46416740564d169a05d2ee40/tumblr_mj72guu6hn1qzxezvo1_500.png"/&gt;&lt;br/&gt;&lt;br/&gt;&lt;h2&gt;Startups - In the END, Only One Thing Matters For The World : SUCCESS!&lt;/h2&gt;&lt;p&gt;So &lt;a href="http://www.taggle.com/"&gt;&lt;strong&gt;Taggle&lt;/strong&gt;&lt;/a&gt; decided to move on instead of fighting to be the &lt;a href="http://en.wikipedia.org/wiki/Last_man_standing_%28gaming%29"&gt;“last man standing in the game”&lt;/a&gt; - you can read more &lt;a href="http://www.pluggd.in/taggle-shuts-shop-297/"&gt;about the whole story on Pluggd.in&lt;/a&gt; and also Mahesh Murthy’s &lt;a href="http://techcircle.vccircle.com/500/mahesh-murthys-open-letter-to-john-kuruvilla-of-taggle/"&gt;open letter on VC Circle to Taggle&lt;/a&gt; and John Kuruvilla’s response to it {for more masala}.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Here is my take and why I disagree with Mahesh Murthy :&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;With all due respect to Mahesh, what are the differentiators for PinStorm in the work they do? How does it differ from the thousands of other agencies out there in the world? As someone said, future prediction (you can bullshit) and past analysis (you can reason to death) are the easiest things to do, but creating something is damn f***king hard. I was surprised to see Mahesh’s stand on Air Deccan, saying “I told you so and it happened”, because he generally preaches about “how predictions suck” during his talks at conferences :P&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Failure vs Success in Startups :&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For the world, ultimately only one thing matters in our business: &lt;strong&gt;&lt;a href="http://startuphoodlum.com/2011/12/06/winning-is-the-only-thing-that-matters/"&gt;whether or not you win&lt;/a&gt;.&lt;/strong&gt; Nobody cares about how you slogged or dragged. &lt;strong&gt;If you succeed then you will be glorified to &lt;span&gt;immortality&lt;/span&gt;, but if you fail then some people will thrash you to death&lt;/strong&gt; {in some cases on how you suck and} on how your business sucked, with every possible reason. As an entrepreneur you will move on and do something that interests you while blogs and media (if you are popular enough) write about how they predicted failure or doomsday. My best wishes to the Taggle team for trying something while a hundred others were counting how many deal sites were coming up. Was I part of that hundred? :)&lt;/p&gt;
&lt;p&gt;I believe taking a dig is much easier than building something valuable and sustaining it {and also going ahead and recreating success in another area}. I tried startups two times and both of them failed terribly; as all of us know, nobody gives a damn about failures after a week - period.&lt;/p&gt;
&lt;p&gt;So as an entrepreneur, &lt;strong&gt;have fun while doing whatever you are doing&lt;/strong&gt; {even if people say you must be a moron to do such a stupid thing}, so that in the end at least you had fun, if not success {even though that’s what matters for the world}.&lt;/p&gt;

&lt;p&gt;Image Credit : Shamelessly &lt;a href="http://ripoffornot.org/images/gandhi1.png"&gt;copied from ripoffornot.org&lt;/a&gt; - go FreshDesk and do it cowboys :)&lt;/p&gt;
&lt;p&gt;P.S : &lt;span&gt;If you are building something and looking for smart guys then go hire engineers from the Taggle team. I spoke to a few and they are really smart guys&lt;/span&gt; :)&lt;/p&gt;</description><link>http://amnigos.com/post/44631358440</link><guid>http://amnigos.com/post/44631358440</guid><pubDate>Thu, 08 Dec 2011 16:08:00 +0530</pubDate><category>JustMigrate</category><category>failures</category><category>startups</category></item><item><title>Building Culture at Kuliza - My talk at HR4Startups Event</title><description>&lt;img src="http://24.media.tumblr.com/6072f54c39848c4b29f771069f873e9f/tumblr_mj72gzzlFq1qzxezvo1_500.jpg"/&gt;&lt;br/&gt;&lt;br/&gt;&lt;h2&gt;Building Culture at Kuliza - My talk at HR4Startups Event&lt;/h2&gt;&lt;p&gt;I was part of a panel discussion at the &lt;a href="http://hr4startups2011.sched.org/"&gt;HR4Startups&lt;/a&gt; event at IIM-B on 3rd December, along with Amiya from ZipDial, Pallav from FusionCharts and Kumud from SuperSeva, on hiring, people management, challenges and what worked for us.&lt;/p&gt;

&lt;p&gt;I did a presentation on our beliefs and culture at Kuliza around hiring and having fun while doing what we do, and on why I believe &lt;strong&gt;“Culture is what separates great companies from others. Not Technology”&lt;/strong&gt;.&lt;/p&gt;

&lt;div style=""&gt;&lt;strong style="display: block; margin: 12px 0 4px;"&gt;&lt;a href="http://www.slideshare.net/amnigos/building-culture-at-kuliza-10451863" title="Building Culture at Kuliza" target="_blank"&gt;Building Culture at Kuliza&lt;/a&gt;&lt;/strong&gt; 
&lt;object height="426" width="510"&gt;
&lt;param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=kulizahr4startups-111203234905-phpapp02&amp;amp;stripped_title=building-culture-at-kuliza-10451863&amp;amp;userName=amnigos"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowScriptAccess" value="always"&gt;&lt;param name="wmode" value="transparent"&gt;&lt;embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=kulizahr4startups-111203234905-phpapp02&amp;amp;stripped_title=building-culture-at-kuliza-10451863&amp;amp;userName=amnigos" type="application/x-shockwave-flash" wmode="transparent" height="426" width="510"&gt;&lt;/embed&gt;&lt;/object&gt;
&lt;/div&gt;
&lt;div style=""&gt;If you are running a startup then what worked for you?&lt;/div&gt;</description><link>http://amnigos.com/post/44631361665</link><guid>http://amnigos.com/post/44631361665</guid><pubDate>Sun, 04 Dec 2011 11:27:00 +0530</pubDate><category>JustMigrate</category><category>Culture</category><category>Hiring</category><category>Kuliza</category><category>startups</category></item><item><title>Netflix - Leveraging Public Cloud (AWS)</title><description>&lt;p&gt;Netflix is the poster boy of public cloud adoption for running large-scale systems, and their &lt;a href="http://techblog.netflix.com/"&gt;engineering blog&lt;/a&gt; has always shared most of their learnings and experience with the AWS cloud - from designing fault-tolerant systems to scaling from SimpleDB to Cassandra. The presentation below from &lt;a href="http://www.slideshare.net/adrianco" class="userimage-link j-tooltip-bottom"&gt;&lt;span class="h-username" style="vertical-align: top;"&gt;Adrian Cockcroft&lt;/span&gt;&lt;/a&gt; shares details of their global platform, including why they chose the AWS cloud :)&lt;/p&gt;
&lt;div style=""&gt;&lt;strong style="display: block; margin: 12px 0 4px;"&gt;&lt;a href="http://www.slideshare.net/adrianco/global-netflix-platform" title="Global Netflix Platform" target="_blank"&gt;Global Netflix Platform&lt;/a&gt;&lt;/strong&gt; 
&lt;object&gt;
&lt;param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=globalnetflixplatform-111120125427-phpapp01&amp;amp;stripped_title=global-netflix-platform&amp;amp;userName=adrianco"&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;param name="allowScriptAccess" value="always"&gt;&lt;param name="wmode" value="transparent"&gt;&lt;embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=globalnetflixplatform-111120125427-phpapp01&amp;amp;stripped_title=global-netflix-platform&amp;amp;userName=adrianco" type="application/x-shockwave-flash" wmode="transparent"&gt;&lt;/embed&gt;&lt;/object&gt;
&lt;div style="padding: 5px 0 12px;"&gt;View more &lt;a href="http://www.slideshare.net/" target="_blank"&gt;presentations&lt;/a&gt; from &lt;a href="http://www.slideshare.net/adrianco" target="_blank"&gt;Adrian Cockcroft&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;</description><link>http://amnigos.com/post/44631362217</link><guid>http://amnigos.com/post/44631362217</guid><pubDate>Tue, 29 Nov 2011 17:35:00 +0530</pubDate><category>JustMigrate</category><category>aws</category><category>cloud</category><category>netflix</category></item><item><title>Cassandra in AWS Cloud : Summary from AWS User Group Bangalore...</title><description>&lt;img src="http://25.media.tumblr.com/bec1a7bb15067facc26bccc0f087525e/tumblr_mj72h9mKGk1qzxezvo1_500.jpg"/&gt;&lt;br/&gt;&lt;br/&gt;&lt;h2&gt;Cassandra in AWS Cloud : Summary from AWS User Group Bangalore Meetup - November&lt;/h2&gt;&lt;p&gt;I co-hosted the November meetup of the &lt;a href="http://www.meetup.com/AmazonAWS-Bangalore"&gt;AWS Cloud User Group Bangalore&lt;/a&gt; at Kuliza Technologies with &lt;a href="http://www.gnuyoga.in"&gt;Sreekandh&lt;/a&gt; and &lt;a href="http://www.vivekjuneja.in/"&gt;Vivek&lt;/a&gt;. The theme of this meetup was “Running Cassandra in Cloud” and it was attended by 15+ people interested in exploring NoSQL solutions like &lt;a href="http://cassandra.apache.org/"&gt;Cassandra&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We started the meetup with introductions and a tribe-forming exercise (it was fun), then divided into 3 groups so each group could present one of the topics: introduction to NoSQL, hands-on Cassandra, schema design, CAP theorem, scalability and performance.&lt;/p&gt;
&lt;p&gt;I did a quick hands-on session on running Cassandra on a single node and presented an overview of all the configuration parameters. All you need &lt;a href="http://wiki.apache.org/cassandra/GettingStarted"&gt;to run a Cassandra node&lt;/a&gt; is a Java 1.6 runtime, and you can download binaries for Windows (don’t forget to set your JAVA_HOME variable). We discussed the need for using different disks for the commit log and the data directory, how to leverage the Row Cache or Key Cache in Cassandra to improve read performance in different use cases, and the various read/write consistency models available.&lt;/p&gt;
&lt;p&gt;Cassandra also ships with a default &lt;a href="http://wiki.apache.org/cassandra/CassandraCli"&gt;command-line client&lt;/a&gt; (cassandra-cli) which can be used to connect to the server, create keyspaces and column families, and write and read data. You can use one of the high-level client bindings to work at the programming level.&lt;/p&gt;
&lt;p&gt;If you are interested in launching a Cassandra cluster without doing too much work then you can explore &lt;a href="http://whirr.apache.org/docs/0.6.0/quick-start-guide.html"&gt;Whirr&lt;/a&gt; - an open source cluster service that allows you to launch cloud-based clusters for Cassandra, Hadoop, HBase etc. in around 10 minutes.&lt;/p&gt;
&lt;p&gt;It was a lot of fun, as you can see from the pics below :)&lt;/p&gt;

&lt;p&gt;Interested in attending or hosting the next session? Join us at the &lt;a href="http://www.meetup.com/AmazonAWS-Bangalore"&gt;AWS User Group&lt;/a&gt; on Meetup.com.&lt;/p&gt;</description><link>http://amnigos.com/post/44631374325</link><guid>http://amnigos.com/post/44631374325</guid><pubDate>Mon, 28 Nov 2011 10:12:00 +0530</pubDate><category>JustMigrate</category><category>aws</category><category>awsugblr</category><category>bangalore</category><category>bigdata</category><category>cassandra</category><category>cloud</category><category>cloud-performance</category></item><item><title>Heroku Launches DB as a Service for PostgreSQL</title><description>&lt;p&gt;Heroku has launched a DBaaS for PostgreSQL - this will be very useful given that PostgreSQL has a large community of users. You can sign up for it at &lt;a href="http://postgres.heroku.com/"&gt;http://postgres.heroku.com/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The Heroku pitch (as taken from their site) is:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;A powerful, reliable, and durable open-source SQL-compliant database, &lt;a href="http://postgres.heroku.com/#postgres"&gt;PostgreSQL&lt;/a&gt; is the datastore of choice for serious applications. Now it is &lt;a href="http://postgres.heroku.com/#single"&gt;available in seconds&lt;/a&gt; with a single click. Never worry about servers. Never worry about  config files. Never worry about patches. Simply focus on your data. &lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;Databases are &lt;a href="http://postgres.heroku.com/#ingress"&gt;multi-ingress&lt;/a&gt;; use them from any cloud, PaaS, or your local computer. It is easy to connect from common languages &amp;amp; frameworks including Rails, Django, PHP, and Java: &lt;a href="http://postgres.heroku.com/#strings"&gt;configuration strings&lt;/a&gt; are generated for them automatically. Need to test a schema migration or perform load testing? &lt;a href="http://postgres.heroku.com/#fork"&gt;Fork&lt;/a&gt; your database to create an exact copy of your schema and data.&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;Scale vertically by choosing from a range of &lt;a href="http://postgres.heroku.com/#plans"&gt;plans&lt;/a&gt;. Plans differ based on the size of their hot-data-set, the portion of data available and optimized on-the-fly in high speed RAM. When the time comes, scale horizontally by adding read-only &lt;a href="http://postgres.heroku.com/#follow"&gt;followers&lt;/a&gt; that stay up-to-date with the master database.&lt;/li&gt;
&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;Forget daily backups, &lt;a href="http://postgres.heroku.com/#protect"&gt;Continuous Protection&lt;/a&gt; redundantly archives data to high-durability storage as it is written, ensuring that it is safe no matter what. &lt;a href="http://postgres.heroku.com/#health"&gt;Automated health-checks&lt;/a&gt; are performed every 30 seconds to ensure that databases are available  and working. And if something goes wrong, there is an ops team on call  24/7. &lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Would you use a DBaaS?&lt;/p&gt;</description><link>http://amnigos.com/post/44631374880</link><guid>http://amnigos.com/post/44631374880</guid><pubDate>Wed, 23 Nov 2011 16:50:00 +0530</pubDate><category>JustMigrate</category><category>DBaaS</category><category>Heroku</category></item></channel></rss>
