At what stage should scalability be addressed?

In this post I will try to be as non-partisan as possible with respect to technology, platforms, programming languages, databases etc. Do forgive me if I fail in this attempt.

Here’s the question on hand:
When should business, leadership, dev teams and other stakeholders address the scalability question?

It’s a dilemma I face, and others depend on my advice, so I’m pooling your wisdom.

Allow me to elaborate on the question. We’re building a product and we have two choices:

Choice 1

Go in for a technology that is extremely popular. It has been around in the mainstream for at least half a decade. It’s a good business choice because of the maturity of the environment, the availability of talent / outsourcing firms and positive feedback overall.

However, there are some questions over scalability. There are many case studies where headline-grabbing companies have had to abandon this technology and go in for a complete or partial rewrite on another platform.

The argument here goes like this:

“If we use this technology and we face scalability problems a few months down the line, it’s a good problem to have because it means we have done really well. So let’s go ahead with this for now, even if it means a rewrite six months down the line.”

Choice 2

We can opt for another, much younger technology. It has an active community but a comparatively less mature environment and toolkits. Compared to Choice 1, it is harder to find good and affordable talent.

The programming language here is extremely expressive yet heavily misunderstood.

However, there are bona fide scaling credentials. It’s been a choice for organisations that are proven technology leaders. And it’s perhaps a safer bet in the long run on the scalability parameter.

Your call

As a programmer / hacker / geek, Choice 2 is an obvious winner. At least for me it’s a no-brainer.

However, put yourself in a product owner / manager / CTO / decision-making role. Does your opinion change?

Welcoming all your comments, thanks!

Really simple file uploads with Node.js and Express

A few days ago I was working on a fairly typical web application and I faced the challenge of implementing a fairly typical web application feature – file uploads. It was the first time I was implementing file uploads with Node (and Express) and I did what anyone else would do – I googled it.

Unfortunately, all the articles and posts out there are either outdated, too complex or plain wrong. So I did the next most obvious thing and posted a question on the mailing list. As always, Mr. Holowaychuk was incredibly quick to respond. His answer led me to do what I should have done in the first place – read the docs.

The upload form

This is the most obvious part of the challenge. You’re probably familiar with this already. Anyway, for the sake of completeness of this article, here it is.

You will need a form in your browser for the file upload. I use Jade to generate my HTML and here’s how it looks:

form(action="...", method="post", enctype="multipart/form-data")
  input(type="file", name="displayImage")

The form.action will point to a route that handles the file upload. More below.

Accessing the uploaded file

If you’re using recent versions of Node and Express, file uploads are a piece of cake. I’ll back this claim, but before we go any further make sure you’re familiar with routes, requests and responses in Express.

Okay, now let’s justify the “piece of cake” claim. In our file upload route, the req parameter has req.files available. Here’s an example of what the req.files would contain:

{
  displayImage: {
    size: 11885,
    path: '/tmp/1574bb60b4f7e0211fd9ab48f932f3ab',
    name: 'avatar.png',
    type: 'image/png',
    lastModifiedDate: Sun, 05 Feb 2012 05:31:09 GMT,
    _writeStream: {
      path: '/tmp/1574bb60b4f7e0211fd9ab48f932f3ab',
      fd: 14,
      writable: false,
      flags: 'w',
      encoding: 'binary',
      mode: 438,
      bytesWritten: 11885,
      busy: false,
      _queue: [],
      drainable: true
    },
    length: [Getter],
    filename: [Getter],
    mime: [Getter]
  }
}

In the req.files object above, the property displayImage is the name of the file field in your HTML form. req.files will contain one property for each valid file field in the form.

The file object contains the type, size and name properties for your server side validations.
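Here is a minimal sketch of what such a validation could look like. The limits, the list of allowed types and the function name validateUpload are my own assumptions for illustration, not something Express provides. Keep in mind that the type value is reported by the browser, so treat it as a hint rather than proof of what the file really is.

```javascript
// Hypothetical limits; adjust them to whatever your application allows.
var MAX_SIZE = 2 * 1024 * 1024; // 2 MB
var ALLOWED_TYPES = ["image/png", "image/jpeg", "image/gif"];

// Checks one entry of req.files and returns an array of error messages.
// An empty array means the file passed every check.
function validateUpload(file) {
  var errors = [];
  if (!file || !file.name) {
    errors.push("No file was uploaded.");
    return errors;
  }
  if (file.size > MAX_SIZE) {
    errors.push("File is larger than " + MAX_SIZE + " bytes.");
  }
  if (ALLOWED_TYPES.indexOf(file.type) === -1) {
    errors.push("File type " + file.type + " is not allowed.");
  }
  return errors;
}

// Example using the values from the req.files dump above.
var errors = validateUpload({ size: 11885, name: "avatar.png", type: "image/png" });
console.log(errors.length === 0 ? "valid" : errors.join(" "));
```

In a route you would call validateUpload(req.files.displayImage) and only continue to the saving step when the returned array is empty.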

Saving the uploaded file

Assuming the file is valid, you use the path property for the next step. The path would typically contain a location in the tmp folder. Your application logic could either require you to access the contents of the file or simply move the uploaded file to another location.

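var fs = require("fs");

// Read the uploaded file from its temporary location.
fs.readFile(req.files.displayImage.path, function (err, data) {
  if (err) { return next(err); } // assumes the route handler also receives next
  // ... modify data here if needed ...
  var newPath = __dirname + "/uploads/uploadedFileName";
  fs.writeFile(newPath, data, function (err) {
    if (err) { return next(err); }
    res.redirect("back");
  });
});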

In the fs.readFile callback, we have the data parameter through which we can access the contents of the file. The example above is taken from an application that needed to modify the file and save it in a new location. Thus fs.writeFile is used to write data to the newPath.

If your app simply needs to move the uploaded file without modifying its contents, fs.rename can be used as a simpler option.
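A rough sketch of the fs.rename variant follows. The helper name moveUpload and the demo file names are made up for this example. One caveat I am sure of: fs.rename fails with an EXDEV error when the temporary folder and the destination live on different file systems, in which case you need to fall back to copying.

```javascript
var fs = require("fs");
var os = require("os");
var path = require("path");

// Moves an uploaded temp file to its final location without reading it.
// In a route, tmpPath would be req.files.displayImage.path.
function moveUpload(tmpPath, newPath, callback) {
  fs.rename(tmpPath, newPath, callback);
}

// Stand-alone demo: create a temp file, move it, then clean up.
var src = path.join(os.tmpdir(), "upload-demo-" + process.pid + ".tmp");
var dest = path.join(os.tmpdir(), "upload-demo-" + process.pid + ".moved");
fs.writeFileSync(src, "hello");
moveUpload(src, dest, function (err) {
  if (err) { throw err; }
  console.log("moved to " + dest);
  fs.unlinkSync(dest);
});
```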

That’s all there is to it. I’ve done file uploads in many server side languages including Python, Java, Scala and PHP and I don’t think it’s ever been this simple.

So much for JavaScript being labeled as an inferior server side language.

JSFoo Chennai 2012 – Ajax is history – Build real time apps in JavaScript

Submitted a proposal for JSFoo Chennai 2012 titled Ajax is history – Build real time apps in JavaScript.

Quick preview:

Objective
How to build the next generation of real-time browser based apps with JavaScript and related technologies.

Description
Ajax is history. After having built Review19 (Review19.com provides next generation real-time collaborative tools), I’d like to share my approach, the technologies used and the architecture, followed by a quick demo and code walkthrough.

Requirements
* Good sense of humor
* Appreciation for random Katrina Kaif pics inserted between slides
* Knowledge of web app development in general
* The more JavaScript stack knowledge (Node, NPM etc.), the better

If you’re interested, don’t forget to vote.

How I built a real-time collaborative app with MongoDB

This is an account of how Review19 was built in just a couple of weeks and how MongoDB was an excellent choice for this genre of browser based, real-time collaborative apps.

First things first — what is Review19? Review19 is the next generation, real-time project collaboration tool for web, creative and software teams. Think of it as a minimalist, real-time Pivotal Tracker alternative. It’s like Trello, but caters to a focused target audience.

Give it a shot, try out Review19. If you’re stuck behind a proxy that doesn’t like web sockets, here’s the failover link.

The Technology Stack

Obviously an app can’t be built with a data store alone.

In addition to a MongoDB based storage engine, the APIs and backend for Review19 are built on the Node.js platform and make use of popular frameworks including Express and Socket.io.

The frontend client app — which is under heavy refactoring as we speak — leverages advances in browser capability with respect to JavaScript execution and CSS renderings. The frameworks here include jQuery, jQuery UI and Sugar among others.

Why am I mentioning all of this? The overall technology stack needs to be known to highlight a huge advantage:

JSON. Everywhere!

I’ve been a professional programmer for ten years now and I’ve had the fortune (misfortune?) of working with a diverse set of technologies.

I happen to have worked with:

  • the typical enterprise Java stack – Spring, Hibernate etc.
  • more recent JVM technologies including Scala, LiftWeb, Play! etc.
  • Python, using Django, Tornado and Google App Engine
  • PHP and its set of frameworks
  • Adobe Flex, working with different types of backend APIs

In each of these technologies, a developer has to deal with the mundane task of transforming his model between the client, server and data store. Better development teams would automate this process with the use of ORMs or helper frameworks. However this would still need the inclusion of these dependencies and the configuration of these frameworks.

While developing Review19 with the chosen technologies, things worked out of the box. It was JSON on the server, JSON on the client and, thanks to MongoDB, JSON on the data store. The productivity benefit was huge, especially for an independent developer like me, or small teams with tough deadlines.

The advantage of this cannot be overstated. It strikes you right away when you build your first proof-of-concept and it stays with you throughout your development cycle.
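To make the point concrete, here is a tiny sketch of one object travelling between server and client with no mapping layer in between. The story object and its field names are invented for illustration; they are not Review19’s actual model.

```javascript
// A hypothetical Review19-style story; the field names are illustrative only.
var story = { title: "Fix login bug", status: "open", tags: ["bug", "auth"] };

// Server side: serialize the object once for the wire.
var wire = JSON.stringify(story);

// Client side: parse it back into a plain object of the same shape.
var received = JSON.parse(wire);

console.log(received.title);
console.log(received.tags.length);
```

The same plain object is also what you hand to the data store. One thing to watch for: values such as Date objects do not survive a JSON round trip unchanged, since they come back as strings.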

Flexibility is not just a buzz-word

About 10-12 months ago, I was working on a project for an SF Bay startup that wanted to build an analytics / BI suite running on top of customer databases.

It wasn’t a very complex project, but just a large and poorly managed one. I remember being asked repeatedly about how the database was being versioned. The answer “it was being versioned in the code” wasn’t good enough for the ‘expert’ and his evil overlords.

Hmm... generating and storing multiple versions of the data model as an .sql file in your source code repository. Sounds familiar? Of course it does. And it sounds just as painful as it sounds familiar. Even if this process is somehow painstakingly automated and devs don’t have to deal with it any more, you still often end up with multiple versions of SQL files that are never used.

Whatever your application, the entities / models are bound to evolve with time. Things will change, sometimes so much that they won’t share even the slightest resemblance with the original designs.

With MongoDB, however, the data store adapts seamlessly to your app. There isn’t the slightest need to run scripts, create tables, alter columns etc. on the data store prior to each deployment. This is how it’s done for Review19 and I’d assume for a majority of projects using MongoDB.

This will bring a small morale boost and a large productivity boost for your dev and devops teams.
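The flip side is that the application has to cope with documents written by older versions of itself. Below is a minimal sketch of one way to handle that, by filling in defaults when a document is read. The function names and fields are assumptions for illustration, not how Review19 necessarily does it.

```javascript
// Hypothetical defaults for fields that were added after the first release.
// A function is used so every story gets its own fresh tags array.
function storyDefaults() {
  return { status: "open", assignee: null, tags: [] };
}

// Returns a copy of doc with any missing fields filled in from the defaults.
function normalizeStory(doc) {
  var story = storyDefaults();
  var key;
  for (key in doc) {
    if (Object.prototype.hasOwnProperty.call(doc, key)) {
      story[key] = doc[key];
    }
  }
  return story;
}

// A document saved by an early version, and one saved by a later version.
var oldDoc = { title: "Set up repo" };
var newDoc = { title: "Add video chat", status: "done", tags: ["video"] };

console.log(normalizeStory(oldDoc).status);
console.log(normalizeStory(newDoc).status);
```

Both shapes can sit side by side in the same collection, and no migration script has to run before the new code is deployed.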

Node.js + MongoDB = MongoJS

MongoDB is one of the most mature NoSQL options available today and that maturity is proven by the rich set of APIs MongoDB provides that allow you to perform basic and advanced operations on your data.

Furthermore, the MongoDB querying APIs are ridiculously simple, yet powerful and easy to master. In fact they’re so easy to learn that you won’t miss the very same SQL that you spent years mastering.

A majority of the MongoDB evangelists and proponents I’ve interacted with swear by this capability. After spending some time in the MongoDB interactive shell, I could see what they meant.

I needed to access the same capabilities in my Node.js backend app. Node.js has a well maintained MongoDB driver and a lot of interesting abstractions on top of that driver.

My choice zeroed in on MongoJS and the clinching argument was the MongoJS mission statement:

“A node.js module for mongodb, that emulates the mongodb API as much as possible.”

Your actual executable JavaScript code looks exactly the same as what you type in the MongoDB console. How can this not be a #win? I like minimalism and MongoJS was as minimalist as possible.
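As a rough illustration, here is a small query helper written against a MongoJS database handle. The collection name stories, the field names and the helper findOpenStories are hypothetical. The way you obtain the db handle has changed between MongoJS versions, so check the README of the version you install.

```javascript
// `db` is expected to be a MongoJS database handle that exposes a
// `stories` collection (a hypothetical name chosen for this sketch).
// In the mongo shell the equivalent query would be:
//   db.stories.find({ project: "review19", status: "open" })
function findOpenStories(db, project, callback) {
  db.stories.find({ project: project, status: "open" }, callback);
}

// Usage (assumes the mongojs module is installed and MongoDB is running):
//   var mongojs = require("mongojs");
//   var db = mongojs("review19", ["stories"]);
//   findOpenStories(db, "review19", function (err, docs) {
//     if (err) { throw err; }
//     console.log(docs.length + " open stories");
//   });
```

The only real difference from the shell is the callback passed as the last argument.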

There are a lot of other options as well for Node.js – MongoDB interaction and it would be best to cover them in detail in a future post.

Speed

Did I mention Review19 is incredibly fast? In almost all the feedback received, users mention, among other things, how fast it is and how that improved their experience.

I understand that application architecture has a role to play in this, but one can’t take all the credit for the speed and performance witnessed by Review19 users. One must thank the underlying platforms it was built on, which include Node.js and MongoDB.

I’m sure you’ll find all the MongoDB speed benchmarks you want. They’re just a Google search away. What I intend to highlight here is that in a real world collaborative app which has a single instance running on standard, non-fancy hardware, Node.js and MongoDB are serving the needs brilliantly.

Remember, this is a real-time collaboration app where, instead of making frequent smaller requests, a frontend client holds a long-lived connection with the backend server. This is exactly the kind of app that would have faced scaling issues a few years ago and would have needed a lot of infrastructure, both in terms of code and hardware.

Node.js and MongoDB make this economical for independent devs like me without the resources big corps can provide.

Conclusion

I’m extremely pleased with the technology choices made, of which MongoDB is of course a critical component. I’ve completed the entire product cycle from conceptualization and design to testing, development and deployments.

Described above are the ways MongoDB made this entire process simpler, kept deployments easy and provided all the features, and more, needed by Review19.

Now I look forward to the next set of challenges, which include maintaining, scaling and enhancing the system. And I promise to let you know how that goes. Stay tuned to this blog! ;-)

How Trello is *not* different

Joel Spolsky, a figure a lot of us look up to, made a very interesting post about Trello and how it is different.

Note – Before reading any further, you should know I am the creator of Review19, a tool quite similar to Trello.

I loved Spolsky’s post, except the title. Trello didn’t seem different at all. Going by a lot of Spolsky’s parameters, Review19 and Trello are very similar.

Here’s trying to shed more light on my claim:

Hosted Only

Review19 – Yes
Trello – Yes

Both Review19 and Trello are hosted only solutions, at least as of now. You cannot buy a license and install privately on your servers.

Continuous Delivery

Review19 – Yes
Trello – Yes

Google Chrome still maintains versions but, for all intents and purposes, is released and delivered continuously. Both Review19 and Trello share this development and delivery model.

Inexhaustive Testing

Review19 – Yes
Trello – Yes

Joel’s team at Fog Creek, by choice, doesn’t exhaustively test Trello before each release. Review19 is being developed and maintained by a single individual, i.e. me, so it’s not really possible to exhaustively test Review19 either. Sorry about that!

Work in public

Review19 – Yes
Trello – Yes

Trello works in public by offering a public status board. Review19 puts itself out there through a public mailing list and roadmap. Volunteers are added to a private status board.

Get big fast

Review19 – Yes
Trello – Yes

Trello aims to get 100 million users. I’d be an idiot if I didn’t want that for Review19 too. ;-)

Free

Review19 – Yes
Trello – Yes

Plugin architecture

Review19 – Yes
Trello – Yes

The merits of such an architecture are obvious. Review19 uses client and server side JavaScript modules for achieving such an architecture. I’m not sure how Trello does it but we have no reason to doubt Joel’s remark.

Node.js

Review19 – Yes
Trello – Yes

MongoDB

Review19 – Yes
Trello – Yes

Web Sockets

Review19 – Yes
Trello – Yes

How Review19 is different

Horizontal

Review19 – No
Trello – Yes

Trello is for everybody. Review19 is for distributed or colocated web, creative and software teams. Spolsky notes the benefits of being vertical, and I agree.

Spolsky wants Trello to be horizontal, which I can understand. I did consider making Review19 horizontal but chose against it.

Video conferencing

Review19 – Yes
Trello – No

Review19’s target audience (web, creative and software teams) is often distributed around the world. Offering an in-browser video conferencing option is a critical feature for Review19.

CoffeeScript

Review19 – No
Trello – Yes

No thanks, I see far too many benefits of staying with JavaScript at least as of now.

APIs top priority?

Review19 – I wish!
Trello – Yes

I wish! I’m only a single guy working full-time on multiple projects to earn my living. Review19 will have APIs at some point!