Pages

Friday, 14 March 2014

Benefits of WebSocket

HTML5 Web Sockets:
A Quantum Leap in Scalability for the Web

By Peter Lubbers & Frank Greco, Kaazing Corporation
(This article has also been translated into Bulgarian.)

Lately there has been a lot of buzz around HTML5 Web Sockets, which defines a full-duplex communication channel that operates through a single socket over the Web. HTML5 Web Sockets is not just another incremental enhancement to conventional HTTP communications; it represents a colossal advance, especially for real-time, event-driven web applications.
HTML5 Web Sockets provides such a dramatic improvement from the old, convoluted "hacks" that are used to simulate a full-duplex connection in a browser that it prompted Google's Ian Hickson—the HTML5 specification lead—to say:
"Reducing kilobytes of data to 2 bytes…and reducing latency from 150ms to 50ms is far more than marginal. In fact, these two factors alone are enough to make Web Sockets seriously interesting to Google."
Let's take a look at how HTML5 Web Sockets can offer such an incredibly dramatic reduction of unnecessary network traffic and latency by comparing it to conventional solutions.

Polling, Long-Polling, and Streaming—Headache 2.0

Normally when a browser visits a web page, an HTTP request is sent to the web server that hosts that page.  The web server acknowledges this request and sends back the response.  In many cases—for example, for stock prices, news reports, ticket sales, traffic patterns, medical device readings, and so on—the response could be stale by the time the browser renders the page. If you want to get the most up-to-date "real-time" information, you can constantly refresh that page manually, but that's obviously not a great solution.
Current attempts to provide real-time web applications largely revolve around polling and other server-side push technologies, the most notable of which is Comet, which delays the completion of an HTTP response to deliver messages to the client. Comet-based push is generally implemented in JavaScript and uses connection strategies such as long-polling or streaming.
With polling, the browser sends HTTP requests at regular intervals and immediately receives a response.  This technique was the first attempt for the browser to deliver real-time information. Obviously, this is a good solution if the exact interval of message delivery is known, because you can synchronize the client request to occur only when information is available on the server. However, real-time data is often not that predictable, making unnecessary requests inevitable and as a result, many connections are opened and closed needlessly in low-message-rate situations.
With long-polling, the browser sends a request to the server and the server keeps the request open for a set period. If a notification is received within that period, a response containing the message is sent to the client. If a notification is not received within the set time period, the server sends a response to terminate the open request. It is important to understand, however, that when you have a high message volume, long-polling does not provide any substantial performance improvements over traditional polling.  In fact, it could be worse, because the long-polling might spin out of control into an unthrottled, continuous loop of immediate polls.
With streaming, the browser sends a complete request, but the server sends and maintains an open response that is continuously updated and kept open indefinitely (or for a set period of time). The response is then updated whenever a message is ready to be sent, but the server never signals to complete the response, thus keeping the connection open to deliver future messages. However, since streaming is still encapsulated in HTTP, intervening firewalls and proxy servers may choose to buffer the response, increasing the latency of the message delivery. Therefore, many streaming Comet solutions fall back to long-polling in case a buffering proxy server is detected. Alternatively, TLS (SSL) connections can be used to shield the response from being buffered, but in that case the setup and tear down of each connection taxes the available server resources more heavily.
Ultimately, all of these methods for providing real-time data involve HTTP request and response headers, which contain lots of additional, unnecessary header data and introduce latency. On top of that, full-duplex connectivity requires more than just the downstream connection from server to client. In an effort to simulate full-duplex communication over half-duplex HTTP, many of today's solutions use two connections: one for the downstream and one for the upstream. The maintenance and coordination of these two connections introduces significant overhead in terms of resource consumption and adds lots of complexity. Simply put, HTTP wasn't designed for real-time, full-duplex communication as you can see in the following figure, which shows the complexities associated with building a Comet web application that displays real-time data from a back-end data source using a publish/subscribe model over half-duplex HTTP.

Figure 1—The complexity of Comet applications
Comet headaches

It gets even worse when you try to scale out those Comet solutions to the masses. Simulating bi-directional browser communication over HTTP is error-prone and complex and all that complexity does not scale. Even though your end users might be enjoying something that looks like a real-time web application, this "real-time" experience has an outrageously high price tag. It's a price that you will pay in additional latency, unnecessary network traffic and a drag on CPU performance.

HTML5 Web Sockets to the Rescue!

Defined in the Communications section of the HTML5 specification, HTML5 Web Sockets represents the next evolution of web communications—a full-duplex, bidirectional communications channel that operates through a single socket over the Web. HTML5 Web Sockets provides a true standard that you can use to build scalable, real-time web applications. In addition, since it provides a socket that is native to the browser, it eliminates many of the problems Comet solutions are prone to. Web Sockets removes the overhead and dramatically reduces complexity.
To establish a WebSocket connection, the client and server upgrade from the HTTP protocol to the WebSocket protocol during their initial handshake, as shown in the following example:

Example 1—The WebSocket handshake (browser request and server response)



Once established, WebSocket data frames can be sent back and forth between the client and the server in full-duplex mode. Both text and binary frames can be sent full-duplex, in either direction at the same time. The data is minimally framed with just two bytes. In the case of text frames, each frame starts with a 0x00 byte, ends with a 0xFF byte, and contains UTF-8 data in between. WebSocket text frames use a terminator, while binary frames use a length prefix.
Note: although the Web Sockets protocol is ready to support a diverse set of clients, it cannot deliver raw binary data to JavaScript, because JavaScript does not support a byte type. Therefore, binary data is ignored if the client is JavaScript—but it can be delivered to other clients that support it.

The Showdown: Comet vs. HTML5 Web Sockets

So how dramatic is that reduction in unnecessary network traffic and latency? Let's compare a polling application and a WebSocket application side by side.
For the polling example, I created a simple web application in which a web page requests real-time stock data from a RabbitMQ message broker using a traditional publish/subscribe model. It does this by polling a Java Servlet that is hosted on a web server. The RabbitMQ message broker receives data from a fictitious stock price feed with continuously updating prices. The web page connects and subscribes to a specific stock channel (a topic on the message broker) and uses an XMLHttpRequest to poll for updates once per second. When updates are received, some calculations are performed and the stock data is shown in a table as shown in the following image.

Figure 2—A JavaScript stock ticker application
Stock ticker application example

Note: The back-end stock feed actually produces a lot of stock price updates per second, so using polling at one-second intervals is actually more prudent than using a Comet long-polling solution, which would result in a series of continuous polls. Polling effectively throttles the incoming updates here.
It all looks great, but a look under the hood reveals there are some serious issues with this application. For example, in Mozilla Firefox with Firebug (a Firefox add-on that allows you to debug web pages and monitor the time it takes to load pages and execute scripts), you can see that GET requests hammer the server at one-second intervals. Turning on Live HTTP Headers (another Firefox add-on that shows live HTTP header traffic) reveals the shocking amount of header overhead that is associated with each request. The following two examples show the HTTP header data for just a single request and response.

Example 2—HTTP request header



Example 3—HTTP response header


Just for fun, I counted all the characters. The total HTTP request and response header information overhead contains 871 bytes and that does not even include any data! Of course, this is just an example and you can have less than 871 bytes of header data, but I have also seen cases where the header data exceeded 2000 bytes. In this example application, the data for a typical stock topic message is only about 20 characters long. As you can see, it is effectively drowned out by the excessive header information, which was not even required in the first place!
So, what happens when you deploy this application to a large number of users? Let's take a look at the network throughput for just the HTTP request and response header data associated with this polling application in three different use cases.
  • Use case A: 1,000 clients polling every second: Network throughput is (871 x 1,000) = 871,000 bytes = 6,968,000 bits per second (6.6 Mbps)
  • Use case B: 10,000 clients polling every second: Network throughput is (871 x 10,000) = 8,710,000 bytes = 69,680,000 bits per second (66 Mbps)
  • Use case C: 100,000 clients polling every 1 second: Network throughput is (871 x 100,000) = 87,100,000 bytes = 696,800,000 bits per second (665 Mbps)
That's an enormous amount of unnecessary network throughput! If only we could just get the essential data over the wire. Well, guess what? You can with HTML5 Web Sockets! I rebuilt the application to use HTML5 Web Sockets, adding an event handler to the web page to asynchronously listen for stock update messages from the message broker (check out the many how-tos and tutorials on tech.kaazing.com/documentation/ for more information on how to build a WebSocket application). Each of these messages is a WebSocket frame that has just two bytes of overhead (instead of 871)! Take a look at how that affects the network throughput overhead in our three use cases.
  • Use case A: 1,000 clients receive 1 message per second: Network throughput is (2 x 1,000) = 2,000 bytes = 16,000 bits per second (0.015 Mbps)
  • Use case B: 10,000 clients receive 1 message per second: Network throughput is (2 x 10,000) = 20,000 bytes = 160,000 bits per second (0.153 Mbps)
  • Use case C: 100,000 clients receive 1 message per second: Network throughput is (2 x 100,000) = 200,000 bytes = 1,600,000 bits per second (1.526 Mbps)
As you can see in the following figure, HTML5 Web Sockets provide a dramatic reduction of unnecessary network traffic compared to the polling solution.
Figure 3—Comparison of the unnecessary network throughput overhead between the polling and the WebSocket applications
Polling versus Web Sockets unnecessary header overhead

And what about the reduction in latency? Take a look at the following figure. In the top half, you can see the latency of the half-duplex polling solution. If we assume, for this example, that it takes 50 milliseconds for a message to travel from the server to the browser, then the polling application introduces a lot of extra latency, because a new request has to be sent to the server when the response is complete. This new request takes another 50ms and during this time the server cannot send any messages to the browser, resulting in additional server memory consumption.
In the bottom half of the figure, you see the reduction in latency provided by the WebSocket solution. Once the connection is upgraded to WebSocket, messages can flow from the server to the browser the moment they arrive. It still takes 50 ms for messages to travel from the server to the browser, but the WebSocket connection remains open so there is no need to send another request to the server.

Figure 4—Latency comparison between the polling and WebSocket applications
Web Sockets versus Comet latency comparison

HTML5 Web Sockets and the Kaazing WebSocket Gateway

Today, only Google's Chrome browser supports HTML5 Web Sockets natively, but other browsers will soon follow. To work around that limitation, however, Kaazing WebSocket Gateway provides complete WebSocket emulation for all the older browsers (I.E. 5.5+, Firefox 1.5+, Safari 3.0+, and Opera 9.5+), so you can start using the HTML5 WebSocket APIs today.
WebSocket is great, but what you can do once you have a full-duplex socket connection available in your browser is even greater. To leverage the full power of HTML5 Web Sockets, Kaazing provides a ByteSocket library for binary communication and higher-level libraries for protocols like Stomp, AMQP, XMPP, IRC and more, built on top of WebSocket.
Figure 5—Kaazing WebSocket Gateway extends TCP-based messaging to the browser with ultra high performance
Kaazing Web Sockets Architecture

Summary

HTML5 Web Sockets provides an enormous step forward in the scalability of the real-time web. As you have seen in this article, HTML5 Web Sockets can provide a 500:1 or—depending on the size of the HTTP headers—even a 1000:1 reduction in unnecessary HTTP header traffic and 3:1 reduction in latency. That is not just an incremental improvement; that is a revolutionary jump—a quantum leap!
Kaazing WebSocket Gateway makes HTML5 WebSocket code work in all the browsers today, while providing additional protocol libraries that allow you to harness the full power of the full-duplex socket connection that HTML5 Web Sockets provides and communicate directly to back-end services. For more information about Kaazing WebSocket Gateway, visit kaazing.com and the Kaazing technology network at tech.kaazing.com.

Summary




About HTML5 WebSockets

The HTML5 WebSockets specification defines an API that enables web pages to use the WebSockets protocol for two-way communication with a remote host. It introduces the WebSocket interface and defines a full-duplex communication channel that operates through a single socket over the Web. HTML5 WebSockets provide an enormous reduction in unnecessary network traffic and latency compared to the unscalable polling and long-polling solutions that were used to simulate a full-duplex connection by maintaining two connections.
HTML5 WebSockets account for network hazards such as proxies and firewalls, making streaming possible over any connection, and with the ability to support upstream and downstream communications over a single connection, HTML5 WebSockets-based applications place less burden on servers, allowing existing machines to support more concurrent connections. The following figure shows a basic WebSocket-based architecture in which browsers use a WebSocket connection for full-duplex, direct communication with remote hosts.
HTML5 WebSockets Architecture
One of the more unique features WebSockets provide is its ability to traverse firewalls and proxies, a problem area for many applications. Comet-style applications typically employ long-polling as a rudimentary line of defense against firewalls and proxies. The technique is effective, but is not well suited for applications that have sub-500 millisecond latency or high throughput requirements. Plugin-based technologies such as Adobe Flash, also provide some level of socket support, but have long been burdened with the very proxy and firewall traversal problems that WebSockets now resolve.
A WebSocket detects the presence of a proxy server and automatically sets up a tunnel to pass through the proxy. The tunnel is established by issuing an HTTP CONNECT statement to the proxy server, which requests for the proxy server to open a TCP/IP connection to a specific host and port. Once the tunnel is set up, communication can flow unimpeded through the proxy. Since HTTP/S works in a similar fashion, secure WebSockets over SSL can leverage the same HTTP CONNECT technique. Note that WebSockets are just beginning to be supported by modern browsers (Chrome now supports WebSockets natively). However, backward-compatible implementations that enable today's browsers to take advantage of this emerging technology are available.
WebSockets—like other pieces of the HTML5 effort such as Local Storage and Geolocation—was originally part of the HTML5 specification, but was moved to a separate standards document to keep the specification focused. WebSockets has been submitted to the Internet Engineering Task Force (IETF) by its creators, the Web Hypertext Application Technology Working Group (WHATWG). Authors, evangelists, and companies involved in the standardization still refer to the original set of features, including WebSockets, as "HTML5."

The WebSocket Protocol

The WebSocket protocol was designed to work well with the existing Web infrastructure. As part of this design principle, the protocol specification defines that the WebSocket connection starts its life as an HTTP connection, guaranteeing full backwards compatibility with the pre-WebSocket world. The protocol switch from HTTP to WebSocket is referred to as a the WebSocket handshake.
The browser sends a request to the server, indicating that it wants to switch protocols from HTTP to WebSocket. The client expresses its desire through the Upgrade header: 



If the server understands the WebSocket protocol, it agrees to the protocol switch through the Upgrade header.



At this point the HTTP connection breaks down and is replaced by the WebSocket connection over the same underlying TCP/IP connection. The WebSocket connection uses the same ports as HTTP (80) and HTTPS (443), by default.
Once established, WebSocket data frames can be sent back and forth between the client and the server in full-duplex mode. Both text and binary frames can be sent in either direction at the same time. The data is minimally framed with just two bytes. In the case of text frames, each frame starts with a 0x00 byte, ends with a 0xFF byte, and contains UTF-8 data in between. WebSocket text frames use a terminator, while binary frames use a length prefix.
WebSocket Frame

Using the HTML5 WebSocket API

With the introduction of one succinct interface (see the following listing), developers can replace techniques such as long-polling and "forever frames," and as a result further reduce latency. 



Utilizing the WebSocket interface couldn't be simpler. To connect to an end-point, just create a new WebSocket instance, providing the new object with a URL that represents the end-point to which you wish to connect, as shown in the following example. Note that a ws:// and wss:// prefix are proposed to indicate a WebSocket and a secure WebSocket connection, respectively.


A WebSocket connection is established by upgrading from the HTTP protocol to the WebSockets protocol during the initial handshake between the client and the server. The connection itself is exposed via the "onmessage" and "send" functions defined by the WebSocket interface.
Before connecting to an end-point and sending a message, you can associate a series of event listeners to handle each phase of the connection life-cycle as shown in the following example. 



To send a message to the server, simply call "send" and provide the content you wish to deliver. After sending the message, call "close" to terminate the connection, as shown in the following example. As you can see, it really couldn't be much easier.






WebSocket Demos

Here you will find an aggregated list of WebSocket demos from all over the Web.


Thumbnail of Echo Test Echo Test
Test a WebSocket connection from your browser against our WebSocket echo server.



Thumbnail of Kaazing.meKaazing.me
Several examples of WebSocket together on the one page. The bar along the bottom of the screen lets you connect to Google Chat over WebSocket.
Mr. Doob's Multiuser Sketchpad
A shared sketchpad that everyone can draw on at the same time. Websockets are used to keep up with the position data shared between users.
Thumbnail of RumpetrollRumpetroll
Norwegian for "tadpole", swim around with other rumpetrolls. WebSockets communicate positional data between all users.
Thumbnail of WordSquaredWordSquared
A massively multiplayer online Scrabble-like word game.






WebGL Aquarium
This is a 100% HTML5, WebGL and JavaScript. Each of the 8 machines is running Chrome. They communicate using WebSockets to coordinate the differenct screens.
The WebSocket Difference
A video that visually demonstrates the performance improvements you get by using WebSocket.
Video Sync with WebSocket
Shows a single controller controlling the video that all connected users are watching.

HTML5 WebSocket

What is WebSocket?

The WebSocket specification—developed as part of the HTML5 initiative—introduced the WebSocket JavaScript interface, which defines a full-duplex single socket connection over which messages can be sent between client and server. The WebSocket standard simplifies much of the complexity around bi-directional web communication and connection management.
WebSocket represents the next evolutionary step in web communication compared to Comet and Ajax. However, each technology has its own unique capabilities. Learn how these technologies vary so you can make the right choice. 

This second section walks you through creating a WebSocket application yourself. 

You can also inspect WebSocket messages using your browser

Creating your own test

Using a text editor, copy the following code and save it as websocket.html somewhere on your hard drive. Then simply open it in a browser. The page will automatically connect, send a message, display the response, and close the connection.



Google OpenID authentication in PHP

Google OpenID authentication in PHP

This post explains you about how to implement Google OpenID authentication in PHP.
In this post we are using  Openid API for google authentication. After download this script change database connectivity in inc/dbConnect.inc.php file and change $callback_website_url variable in login_process.php file.

Download Link   Demo Link

Sample database design for table name contact.
This table contains  id (primary key),tutorial and link.



dbConnect.inc.php 
Contains database connectivity code

 
index.php 
   
login_process.php 
Contains the login part and include openid api file.

 
google_landing.php 
 After Authentication, google will redirect page


 

Thursday, 27 February 2014

Announcing Spark: authoring improvements for Drupal 7 and Drupal 8

At DrupalCon Denver, I announced the need for a strong focus on Drupal's authoring experience in my State of Drupal presentation. During my core conversation later in the week, I announced the creation of a Drupal 7 distribution named "Spark" (formerly code-named "Phoenix"). The goal of Spark is to act as an incubator for Drupal 8 authoring experience improvements that can be tested in the field.
I hope for Spark to provide a "safe space" to prototype cutting-edge interface design and to build excellent content tools that are comparable with the experience of proprietary alternatives. While not a final list, some initial thinking around the features we want to experiment with is:
  • Inline editing and drag-and-drop content layout tools ("true" WYSIWYG)
  • Enhanced content creation: auto-save, save as draft and more
  • Useful dashboards for content creators
  • Mobile content authoring and administration support
The vision behind the Spark distribution is to be "the Pressflow of Drupal authoring experience". Pressflow provided a "spoon" of Drupal 6 with various performance enhancements that made their way into Drupal 7 core while it was in development. The same improvements were made available to Drupal 6 users so they could easily be tested in the field. With Spark, we want to test authoring experience improvements in Drupal 7 on real sites with real users and real content. We also want to target the best improvements for inclusion into Drupal 8 core.
I'm excited to announce that Acquia will fund the Spark distribution. Core developers Gábor Hojtsy and Wim Leers will work on Spark full-time starting in late May. They will work along side Angie Byron (webhchick), Alex Bronstein (effulgentsia), myself and other members at Acquia. While we have some promising candidates so far, Acquia is still seeking applicants to join the Spark team (with a strong preference to candidates located in or willing to move to the Boston area):
The Spark team will collaborate with the Drupal usability and the core

Understanding The New Drupal Release Cycle

At the end of 2013, big changes were made to the Drupal release cycle.
You can read full details of the changes at https://drupal.org/node/2135189.
The changes will start with Drupal 8, but are going to impact almost all current and future versions.
In this blog post, I'll give you a short, plain-English overview of the changes.
I'll show how the changes will impact users of Drupal 6, 7, 8 and 9.

What is changing for Drupal 8?

  • Drupal 8 will have several releases with new features.
  • New Drupal 8 releases will come out every 6 months. They may have new features and will be called 8.1, 8.2, 8.3 etc.
  • The latest release will be the only one that's supported.
  • The end of the process will be a Long Term Support version or LTS. When that LTS arrives, no new features will be added, only security fixes.
Here's a graphical overview of how the releases will work:
media_1393365446214.png
Here's the big change: new features can now be added after the 8.0 release.
One of the downsides to Drupal's old release cycle is that new features couldn't be added.
Drupal 7 was essentially finished in 2010 (although it was released the next year). So no new features have been added to Drupal since 2010.
That feature freeze is great for enterprise customers, but very clunky for today's software developers who are used to working iteratively, making fast and regular updates.
In the proposal, Dries said that the new release cycle allows them to use "a more agile development approach of getting smaller changes out faster, seeing how they do, and making further changes based on real-world data. "
In the comments, Jesse Beach (who is responsible for much of the work on Drupal's UI) said, "Building a usable UI requires numerous iterations and we really don't get those over large major release cycles."

What does this mean for Drupal 9?

  • Drupal 9 will only start development when the the core team have completed a new feature big enough to justify a new version.
  • Drupal 9 should arrive on a normal schedule, which is 2 or 3 years after Drupal 8.

What does this mean for Drupal 6 and 7?

Wait, why are we bringing 6 and 7 into this?
Because this new release cycle may be good news for owners of Drupal 6 and 7 sites:
  • Drupal 6 would be security supported until 8's LTS rather than only until 8.0.
  • Drupal 7 would be security supported until 9's LTS rather than only until 9.0.
So this change could mean an extra 18 months of fixes. It's possible that this could mean Drupal 7 support until 2018 or 2019.

What does this mean for you building sites in the future?

Be very careful with the selection and quantity of modules you use in Drupal 8. This is good practice anyway, but will become a more formal recommendation with Drupal 8.
Here's the advice from the release cycle proposal:
"Don't use custom modules; stick to only contrib modules that are adequately maintained. Or, if your site requires custom code, make sure the developers you hire know to limit their code to only accessing APIs marked as stable."
In other words, changes may happen between Drupal 8.0 and the Drupal 8 LTS. If you build you own modules or use poorly supported custom modules, watch carefully in case updates cause problems.

Wait, doesn't this all sounds very familiar?

Yes, this release cycle is very similar to those adopted by Joomla and Typo3 amongst others.
This release cycle is one form of Semantic Versioning which is adopted by many products: http://semver.org.