Background

About two years ago, Jeff Cross and I created Deployd, an interactive development environment (IDE) and API server for rapidly prototyping REST APIs for HTML5 and mobile apps. As front-end developers, we wanted to be able to easily create backends so that we had an entire stack to test. It turned out that what we built was very useful for prototyping but lacked some features that larger companies require to go from prototype to final product.

A little over a year ago I started working on a new project at StrongLoop called LoopBack. The idea was to improve on the Deployd design. From the beginning we set out to build a production-grade API server that would fit the requirements of large organizations building APIs for HTML5 and mobile apps. In this article we’ll go over what I consider the first three generations of API servers, how they influenced the design of LoopBack, and how they can help you build a large-scale, production-grade REST API.

The First Generation: Birth of the API Server

First-generation API servers were built using frameworks originally designed for building web applications. In Java these were Spring MVC and Struts. Ruby, PHP, and Python have frameworks like Rails, Zend, and Django. The heyday of these frameworks was before Node.js, but several come to mind from the Node world: Sails, CompoundJS, and Geddy.

Most of these MVC frameworks include a database abstraction (typically an ORM for RDBMS only), templating libraries for rendering HTML, and a base controller for gluing the two together. This gives developers a lot of flexibility while creating a clear separation between view and data. For the typical web application where views are rendered on the server, this is everything you need.

As browser clients became more complex, using new features like AJAX and the ability to render templates on the client, the MVC-based backend was stretched to serve as both the web application and the API server. Controllers would implement some methods that returned an entire rendered page and others that returned only data.
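To make the dual role concrete, here is a minimal sketch of such a controller in plain JavaScript. It does not correspond to any of the frameworks named above; `ProductController`, `renderTemplate`, and the in-memory `products` array are hypothetical stand-ins for a base controller, a templating library, and an ORM-backed model.

```javascript
// Hypothetical in-memory data standing in for an ORM-backed model.
const products = [
  { id: 1, name: 'pencil' },
  { id: 2, name: 'pen' },
];

// Hypothetical stand-in for a server-side templating library.
function renderTemplate(items) {
  const rows = items.map((p) => `<li>${p.name}</li>`).join('');
  return `<ul>${rows}</ul>`;
}

// One controller doing double duty: some actions return HTML pages,
// others return raw data for AJAX clients.
const ProductController = {
  // Classic MVC action: returns a fully rendered page.
  index() {
    return { contentType: 'text/html', body: renderTemplate(products) };
  },
  // API-style action: returns JSON for a client-side renderer.
  indexJson() {
    return { contentType: 'application/json', body: JSON.stringify(products) };
  },
};
```

Both actions read the same data, which is why the split into a separate web application and API server described next felt natural.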

For simple browser applications this worked well enough. As mobile applications proliferated and MVC moved towards the client, the separation became clear. You would build an MVC application and an API server. The MVC application would provide the user interface and the API server would provide the data.

The Second Generation: The Thin Server

Once it was clear that developers were building two separate applications, the focus of frameworks shifted. With web applications moving into the client and an entirely separate API server, both could be thinned out. The web application could no longer access the database directly; instead, that access could be passed through to the client. This was made possible by a new set of frameworks and databases: MongoDB, CouchDB, Sinatra, Express, Deployd, Meteor, and others.

These new technologies let developers build much thinner servers, or even go “noBackend” altogether. This was made possible by making “data on the wire” a first-class citizen. Typically this meant surfacing the database-style APIs of JSON document databases.

Complex browser and mobile applications need “data on the wire” to transfer state to and from various data sources (databases, backend services, etc.). The second generation of API servers and frameworks made this a first-class, “out of the box” feature. But this is only the beginning for a data-rich browser or mobile application. Access to raw data is important, but clients require much more from an API server. Data aggregation, security, validation, and denormalization are usually requirements.

The Next Generation: Optimized API Delivery

We learned several lessons from the second generation of API servers. Providing a pass-through API to a database does not scale to large applications that require access to many data sources. Packaging features that make building an API simple is helpful. Delivering these features in a black box makes customization painful. Finally, REST and HTTP are becoming standard for API servers, but should not dictate the design of our API server.

Open Up the Black Box

Gen II API servers made building a backend for a mobile or web application a lot easier. This came with a cost. Technologies like CouchDB, Meteor, Deployd, MongoDB, and others did so much for you that getting something to work wasn’t the problem. But when memory usage on your production machines was skyrocketing or requests were taking too long, there wasn’t much you could do. The idea of “noBackend” doesn’t have to mean you install a black box for your API and it must work without modification. Next generation API servers should provide all the out-of-the-box features clients require, especially the hard parts, and provide hooks everywhere. You should even be able to replace entire parts of the API server if your application requires it.

In order for the hooks or customizations to be practical, the API server should provide deep introspection APIs. In LoopBack we provide the app.remotes() API. It returns your API’s HTTP routes, schema definitions, and all other metadata associated with your API server. This is useful for scaffolding client applications, generating documentation, and anything else that requires access to your API’s metadata.
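The idea behind an introspection API can be sketched without any framework. The following is not LoopBack's app.remotes() implementation; `RemoteRegistry`, `register`, and `describe` are hypothetical names used only to show how registering methods together with their metadata lets other tools read that metadata back.

```javascript
// Hypothetical registry: every remote method is registered with metadata,
// so the same description can drive routing, docs, and client scaffolding.
class RemoteRegistry {
  constructor() {
    this.methods = [];
  }

  // Register a method together with its HTTP mapping and argument schema.
  register(name, { verb, path, accepts = [], returns = 'object' }, handler) {
    this.methods.push({ name, verb, path, accepts, returns, handler });
  }

  // Introspection: describe every route without exposing the handlers.
  describe() {
    return this.methods.map(({ name, verb, path, accepts, returns }) => ({
      name, verb, path, accepts, returns,
    }));
  }
}

const registry = new RemoteRegistry();
registry.register(
  'Product.findById',
  { verb: 'GET', path: '/products/:id', accepts: [{ arg: 'id', type: 'number' }] },
  (id) => ({ id, name: 'pencil' })
);

// A documentation generator could consume the metadata like this.
const docs = registry.describe().map((m) => `${m.verb} ${m.path} -> ${m.returns}`);
```

Because the metadata lives next to the handler, documentation and generated clients are less likely to drift from what the server actually exposes.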

Data Access

Providing a pass-through API to a database does not scale to large applications that require access to many data sources. Large applications use a myriad of data sources: databases, other REST APIs, proprietary services, etc. Instead, an API server should provide an abstract API to these data sources that allows you to describe relationships and do ad hoc aggregation and filtering.

Second generation data access was largely dependent on databases, such as MongoDB and CouchDB, that made creating a proxy API simple. Applications can easily outgrow the proxy API. An example of this is a couchapp that provided a statistics database. Users could query several years’ worth of statistical data, which would occasionally bring the couchapp down. One way to fix this problem would be to restrict access to the statistics database and only provide an API that accesses a pre-aggregated set of data. This is pretty difficult to do with CouchDB. Our approach in LoopBack is to make it trivial to add these kinds of aggregate APIs. Take a look at remote methods for more.
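As a rough illustration of the pre-aggregation idea, the sketch below exposes a single method that reads a small summary instead of the raw rows. It is plain JavaScript rather than LoopBack's remote method API, and `dailyStats`, `aggregateByMonth`, and `getMonthlyVisits` are hypothetical names with made-up sample data.

```javascript
// Hypothetical raw data: one row per day. In a real system this would live
// in a database and span several years.
const dailyStats = [
  { date: '2013-01-01', visits: 10 },
  { date: '2013-01-02', visits: 20 },
  { date: '2013-02-01', visits: 5 },
];

// Aggregate once (for example on a schedule) rather than on every request.
function aggregateByMonth(rows) {
  const totals = {};
  for (const { date, visits } of rows) {
    const month = date.slice(0, 7); // 'YYYY-MM'
    totals[month] = (totals[month] || 0) + visits;
  }
  return totals;
}

const monthlyTotals = aggregateByMonth(dailyStats);

// The only method exposed to clients: it reads the small pre-aggregated
// summary and never touches the raw rows.
function getMonthlyVisits(month) {
  return { month, visits: monthlyTotals[month] || 0 };
}
```

A client can then ask for one month at a time, and the cost of each request no longer grows with the size of the raw data.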

A lot of time developing an API is spent implementing security rules from scratch. This inspired us to include access control lists (ACLs) in LoopBack. Instead of writing middleware or other functions to validate a given request, LoopBack allows you to define a list of access controls. The following is a basic example from a banking application.

slc lb acl --allow --owner --read --write --model account

The command above generates the access control lists in a JSON config file. Now only the user that owns an account may read or write it. For more, see the entire LoopBack Access Control Example.
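To show what declaring access controls instead of writing per-request checks can look like, here is a simplified sketch. The entry shape (`principal`, `access`, `permission`) and the last-match-wins rule are assumptions made for illustration; they are not LoopBack's actual ACL format or resolution algorithm.

```javascript
// Hypothetical ACL entries: deny everyone by default, then allow the owner
// to read and write. The field names are illustrative only.
const accountAcls = [
  { principal: 'everyone', access: '*', permission: 'DENY' },
  { principal: 'owner', access: 'READ', permission: 'ALLOW' },
  { principal: 'owner', access: 'WRITE', permission: 'ALLOW' },
];

// Work out which principals apply to this user for this record.
function principalsFor(user, record) {
  const principals = ['everyone'];
  if (user && user.id === record.ownerId) principals.push('owner');
  return principals;
}

// Assumed rule for this sketch: later entries override earlier ones.
function isAllowed(acls, user, record, access) {
  const principals = principalsFor(user, record);
  let decision = 'DENY';
  for (const acl of acls) {
    const accessMatches = acl.access === '*' || acl.access === access;
    if (accessMatches && principals.includes(acl.principal)) {
      decision = acl.permission;
    }
  }
  return decision === 'ALLOW';
}

const account = { id: 7, ownerId: 1, balance: 100 };
const alice = { id: 1 }; // owns the account
const bob = { id: 2 };   // does not
```

With these definitions, `isAllowed(accountAcls, alice, account, 'READ')` is true while the same call for `bob` is false, and no request-specific validation code had to be written.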

Aggregation and Mashup

Many frameworks claim to do aggregation simply through “joining” of data. For example, OpenJPA and Hibernate provide related entity support. The issue is that the relationship exists only within a single data source.

Another approach to aggregation is to create a set of workers that aggregate and denormalize data from various sources and drop that data into a JSON document database. For example, all the data required by a user’s homepage might span many data sources. This can be aggregated into a single document and easily fetched when the homepage is loaded. The issue with this approach is the latent ahead-of-time denormalization. The homepage will be out of date if data has changed after the aggregation.

Java frameworks like Teiid were created to solve this issue. They provide an abstraction for data access that supports integration of distributed data sources without latent copying or moving of data. In LoopBack we provide relation and inclusion APIs for ad hoc data aggregation. This allows you to define relationships between data, even from distributed data sources, and request the aggregate data based on those relationships.

The following example would get a product, product details, and related products over REST. Many data source setups are possible; in this example, the product and product detail data could come from an inventory database, and the related products could come from a separate backend REST or even SOAP API.

curl -g "http://api.myapp.com/api/products/42?filter[include]=productDetails&filter[include]=relatedProducts"

{
  "name": "pencil",
  "productDetails": {
    "availableColors": ["red", "blue"]
  },
  "relatedProducts": [
    {"id": 43, "name": "pen"}
  ]
}
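A request like the one above can be served by resolving each included relation from whichever source holds it. The sketch below is not LoopBack's relation or inclusion API; `inventoryDb`, `recommendationService`, `productRelations`, and `findProduct` are hypothetical, and both data sources are synchronous in-memory stand-ins where a real system would make asynchronous calls.

```javascript
// Two hypothetical data sources: an inventory database and a separate
// recommendations service. Both are in-memory stand-ins here.
const inventoryDb = {
  products: { 42: { id: 42, name: 'pencil' } },
  productDetails: { 42: { availableColors: ['red', 'blue'] } },
};
const recommendationService = {
  relatedTo: (productId) => (productId === 42 ? [{ id: 43, name: 'pen' }] : []),
};

// Each relation knows how to fetch its data, regardless of where it lives.
const productRelations = {
  productDetails: (product) => inventoryDb.productDetails[product.id] || null,
  relatedProducts: (product) => recommendationService.relatedTo(product.id),
};

// Resolve a product and only the relations the client asked to include.
function findProduct(id, include = []) {
  const product = inventoryDb.products[id];
  if (!product) return null;
  const result = { ...product };
  for (const relation of include) {
    const resolve = productRelations[relation];
    if (!resolve) throw new Error(`Unknown relation: ${relation}`);
    result[relation] = resolve(product);
  }
  return result;
}
```

Calling `findProduct(42, ['productDetails', 'relatedProducts'])` assembles the aggregate at request time, so nothing is copied ahead of time and nothing goes stale.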

REST

Next generation API servers should treat REST as the transport, not the programming model. API servers should adhere to REST and make it simple to implement a REST API. However, the client application should drive the programming model for the API server. In LoopBack we treat the API server as the “M” in “MVC”. This allows your API to flex to the demand of its clients. If your clients and server share the same model, you can compose the same models across your various clients.

It is also important for API servers not to confuse REST with HTTP. The programming model mentioned above means that protocols such as WebSockets are just an implementation detail. You shouldn’t have to re-implement your API server just to give a client access over another protocol. The models and rules you define for your API should work for any protocol.
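One way to picture the protocol as an implementation detail is a model with two thin adapters in front of it. This is a sketch under stated assumptions, not how LoopBack is implemented: `Product`, `handleRest`, and `handleMessage` are hypothetical, and the message adapter only imitates the kind of JSON messages a WebSocket layer might carry.

```javascript
// The model: plain methods with no knowledge of any transport.
const Product = {
  findById(id) {
    return id === 42 ? { id: 42, name: 'pencil' } : null;
  },
};

// Hypothetical REST adapter: maps a verb and path onto a model method.
function handleRest(verb, path) {
  const match = /^\/products\/(\d+)$/.exec(path);
  if (verb === 'GET' && match) {
    const product = Product.findById(Number(match[1]));
    return product ? { status: 200, body: product } : { status: 404, body: null };
  }
  return { status: 404, body: null };
}

// Hypothetical message adapter (the kind of thing a WebSocket layer might
// use): maps a JSON message onto the same model method.
function handleMessage(raw) {
  const { method, args } = JSON.parse(raw);
  if (method === 'Product.findById') {
    return JSON.stringify({ result: Product.findById(...args) });
  }
  return JSON.stringify({ error: `Unknown method: ${method}` });
}
```

Adding another protocol here means writing another small adapter; the model and any rules attached to it stay untouched.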

Recap

Generation | Pattern     | Primary Use Case            | Features
1          | MVC         | Web Applications            | Templating, ORM, AJAX, Routing
2          | Thin Server | Browser MVC and Mobile Apps | JSON, Document Databases, Data on the Wire
3          | REST        | REST APIs                   | REST, Security/ACLs, Data Aggregation, Distributed Data Sources

Browser and mobile applications are increasingly complex and require sophisticated data access. Several generations of API servers and frameworks have risen to meet this demand by supporting the requirements of these rich applications out of the box. The next generation of API servers and frameworks should support not only “data on the wire”, but aggregation, security and other features required by increasingly complex clients. If you are looking at building an API server, LoopBack provides the next generation of features that allow you to support the requirements of even richer client applications.

What’s next?

  • Install LoopBack with a simple npm command
  • What’s in the upcoming Node v0.12 version? Big performance optimizations. Read the blog by Ben Noordhuis to learn more.
  • Need performance monitoring, profiling, and cluster capabilities for your Node apps? Check out StrongOps!