
What Happens in One Internet Minute

Facebook’s fifth Open Compute Summit is on the horizon for January 2014, and looking at what the social networking leader has been working on over the past year provides some clues about what will be in the spotlight at OCP Summit V.

In the Internet.org paper issued in September 2013, Facebook, Ericsson and Qualcomm noted that boosting datacenter efficiency is essential to bringing down costs and expanding Internet access across the globe. The paper, which offers many lessons for enterprises working to improve real-time responsiveness in our information economy without breaking the bank, argued that integrated efforts have the potential to increase datacenter efficiency by an astounding 100x over the next five to ten years.

This achievement requires two key innovations outlined by Facebook. First, the underlying costs of delivering data need to be reduced. Second, data use needs to be streamlined by building more efficient applications. It is no coincidence that flash memory in the datacenter is playing an important role in both areas.

The Need for Efficiency: Very Big Data

It is clear that efficiency is essential when you examine the volume and velocity of the data we now create daily. In the Internet.org paper, Facebook notes that every day, more than 4.75 billion content items are shared on Facebook. Intel’s “Internet Minute” infographic notes that there are 1.3 million YouTube views every 60 seconds. And Cisco predicts that today’s astounding Internet traffic will grow 4.5x by 2017.

Data about data is now frequently shared online, ironically quantifying something IT professionals know very well: at work and at play, we generate huge amounts of information daily. As if our own consumption were not enough, the growth in analytics surpasses even the growth in Internet traffic. User data in Facebook’s warehouses grew 4,000 times between 2009 and 2013, even as the number of users grew 5.5 times, from 200 million to roughly 1.15 billion.

Facebook notes that analytics-driven personalization of each user’s experience required processing “tens of thousands of pieces of data, spread across hundreds of different servers, in a few tens of microseconds.” This complexity meant Facebook was processing 1,000 times more traffic inside its data centers, as information traveled between servers and clusters, than the traffic flowing into and out of its facilities. Cisco predicts that data center traffic will grow 3x, reaching 7.7 zettabytes by 2017.

While few enterprises are faced with the challenge of processing this much information today, Facebook’s solutions for managing data at scale present important lessons for companies of all sizes, on boosting IT efficiency while guarding the bottom line.

Saving millions while scaling services

Flash memory in data center servers is quickly becoming critical to achieving efficiency in scalable IT. Facebook and other leading hyperscale and enterprise companies have been using flash to process more transactions with fewer servers, thereby reducing operating costs. Besides being a thousand times faster than disk drives, modern flash uses much less energy than yesterday’s spinning disk storage, or even the DRAM in servers. Flash generates less heat, and thus requires far less energy to cool. It also takes up less space in the datacenter than racks of disks, for even greater savings and efficiency.

All three Facebook data centers easily beat the industry gold-standard Power Usage Effectiveness (PUE) of 1.5, with Facebook reporting that its PUEs range from 1.04 to 1.09. In a recent report, Businessweek noted that Facebook’s Sweden data center is, "By all public measures…the most energy-efficient computing facility ever built."
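
For readers unfamiliar with the metric, PUE is simply the ratio of total facility power to the power drawn by the IT equipment itself, so a PUE of 1.09 means only about 9% of the power goes to cooling, power distribution and lighting. The small Python sketch below illustrates the arithmetic; the numbers are made up for illustration and are not Facebook’s measurements.

```python
# Power Usage Effectiveness (PUE) = total facility power / IT equipment power.
# The figures below are illustrative only, not Facebook's published numbers.

def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Return the PUE ratio; 1.0 would mean zero overhead for cooling,
    power distribution and lighting."""
    return total_facility_kw / it_equipment_kw

print(pue(1090.0, 1000.0))  # 1.09 -> roughly 9% overhead
print(pue(1500.0, 1000.0))  # 1.50 -> the often-cited industry benchmark
```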

As Facebook notes in its Internet.org paper, "Making the delivery of data faster and more efficient is crucial to ensuring access to more people in more diverse regions of the world." Beyond the philanthropic appeal of expanding internet access and improving quality of life for billions, these lower-cost infrastructures create new markets for businesses of all sizes, as more and more consumers are able to participate in the global information economy.

The importance of optimizing applications for efficiency

While Facebook’s attention is focused on reducing the amount of data used by devices on the consumer side of the Internet, there are also interesting lessons for enterprises on how data can be streamlined on the server side. For example, flash memory finally makes it possible to move beyond disk-era application code. By streamlining software and removing the unnecessary layers of complexity associated with archaic storage architectures, applications perform faster while using less data. For instance, data management systems can shortcut costly address-translation layers, avoid double buffering, and eliminate the locking and resource overheads associated with input-output operations that remain outstanding for long periods of time.

To break free from the limitations of disk-era architectures, new open source APIs available on GitHub make it possible for enterprise software developers to add flash-aware operations to their applications. These APIs speed up application performance by optimizing for the efficient protocols and processes of all-electronic solid state flash memory, rather than assuming data-access paradigms suited to mechanical disk drive heads moving across spinning platters. Flash-aware APIs can also lower capital expenditures by helping flash last longer, because they can cut the number of write operations to the flash volume in half. They further simplify development, helping developers get applications to market faster. Open-source data-tier innovators MariaDB and Percona already exploit the Open NVM atomic-writes API in their code.
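
To make the double-buffering point concrete, here is a minimal, purely illustrative sketch of why an atomic multi-block write can halve write volume. The atomic_write_supported flag and the page counts are assumptions for the example; this is not the actual Open NVM interface, just the accounting behind the claim.

```python
# Illustrative accounting only: compare bytes written per commit with and
# without a flash-aware atomic multi-block write. The capability flag is a
# stand-in for what an atomic-writes API provides, not a real library call.

def bytes_written_per_commit(page_size: int, pages: int,
                             atomic_write_supported: bool) -> int:
    if atomic_write_supported:
        # One atomic batch: each dirty page hits the flash volume exactly once.
        return page_size * pages
    # Disk-era double buffering: each page is first written to a journal
    # (write-ahead log) and then again to its final location.
    return 2 * page_size * pages

print(bytes_written_per_commit(16 * 1024, 64, atomic_write_supported=False))  # 2 MiB
print(bytes_written_per_commit(16 * 1024, 64, atomic_write_supported=True))   # 1 MiB
```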

Capitalizing on Strategic Change in IT

As enterprises balance the need to scale with the need to manage costs, Facebook’s lessons are increasingly important for companies operating in the information economy. Across open source and traditional computing, flexible solutions optimized with intelligence will be critical assets for tomorrow’s business leaders.

With the democratization of data being led by Facebook and the Open Compute Project, many leaders are already working to understand and adopt these key practices to ensure they remain competitive amidst ongoing IT industry transitions. At OCP Summit V, it will be interesting to see what breakthroughs continue to add uncommon value to common standards.

Can Hadoop help your organization?

Hadoop will have a profound impact on the enterprise data landscape. Understanding the common patterns of use can greatly reduce the complexity.

Just a few weeks ago, Apache Hadoop 2.0 was declared generally available–a huge milestone for the Hadoop market, as it unlocks the vision of interacting with stored data in unprecedented ways. Hadoop remains the typical underpinning technology of “Big Data,” but how does it fit into the current landscape of databases and data warehouses already in use? And are there typical usage patterns that can distill some of the inherent complexity so we can all speak a common language?

Real-world scenarios of Hadoop use

Hadoop was originally conceived to solve the problem of storing huge quantities of data at very low cost for companies like Yahoo, Google and Facebook. Now it is increasingly being introduced into enterprise environments to handle new classes of data. Machine-generated data, sensor data, social data, web logs and similar types are growing exponentially, and are often (but not always) unstructured in nature. It is this type of data that is turning the conversation from “data analytics” to “big data analytics,” because so much insight can be gleaned from it for business advantage.

Analytic applications come in all shapes and sizes–and most importantly, are oriented around addressing a particular vertical need. At first glance, they can seem to have little relation to one another across industries and verticals. But in reality, when observed at the infrastructure level, some very clear patterns emerge: most of these applications fit one of the following three patterns.

Pattern 1: Data refinery

The “Data Refinery” pattern of Hadoop usage is about enabling organizations to incorporate these new data sources into their commonly used BI or analytic applications. For example, I might have an application that gives me a view of my customer based on all the data about them in my ERP and CRM systems, but how can I incorporate data from their sessions on my website to see what they are interested in? The “Data Refinery” usage pattern is what customers typically turn to in this situation.

The key concept here is that Hadoop is being used to distill large quantities of data into something more manageable. That resulting data is then loaded into the existing data systems to be accessed by traditional tools–but with a much richer data set. In some respects, this is the simplest of all the use cases in that it provides a clear path to value for Hadoop with very little disruption to the traditional approach. No matter the vertical, the refinery concept applies. In financial services, we see organizations refine trade data to better understand markets or to analyze and value complex portfolios. Energy companies use big data to analyze consumption over geography to better predict production levels. Retail firms (and virtually any consumer-facing organization) often use the refinery to gain insight into online sentiment. Telecoms are using the refinery to extract details from call data records to optimize billing. Finally, in any vertical where we find expensive, mission-critical equipment, we often find Hadoop being used for predictive analytics and proactive failure identification. In communications, this may be a network of cell towers; a restaurant franchise may monitor refrigerator data.
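
As a concrete, simplified illustration of the refinery pattern, the sketch below uses Hadoop Streaming with a Python mapper and reducer to boil raw web-log lines down to page-view counts per customer. The tab-separated log format, field positions and script names are assumptions made for this example.

```python
#!/usr/bin/env python
# refinery_mapper.py -- emit (customer_id, 1) for every web-log line.
# Assumed input format: timestamp<TAB>customer_id<TAB>url (illustrative only).
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 3:
        customer_id = fields[1]
        print(f"{customer_id}\t1")
```

```python
#!/usr/bin/env python
# refinery_reducer.py -- sum page views per customer_id.
# Hadoop Streaming sorts the mapper output by key before calling the reducer.
import sys

current_key, count = None, 0
for line in sys.stdin:
    key, value = line.rstrip("\n").split("\t")
    if key != current_key:
        if current_key is not None:
            print(f"{current_key}\t{count}")
        current_key, count = key, 0
    count += int(value)
if current_key is not None:
    print(f"{current_key}\t{count}")
```

The two scripts would be submitted with the hadoop-streaming jar that ships with Hadoop (via its -input, -output, -mapper, -reducer and -file options), and the per-customer totals written to HDFS can then be loaded into the existing warehouse alongside the ERP and CRM data.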

Pattern 2: Data exploration with Apache Hadoop

The second most common use case is one we call “Data Exploration.” In this case, organizations capture and store a large quantity of this new data (sometimes referred to as a data lake) in Hadoop and then explore that data directly. So rather than using Hadoop as a staging area for processing and then putting the data into the enterprise data warehouse–as is the case with the Refinery use case–the data is left in Hadoop and then explored directly.

The Data Exploration use case is often where enterprises start: capturing data that was previously being discarded (exhaust data such as web logs, social media data, etc.) and building entirely new analytic applications that use that data directly. Nearly every vertical can take advantage of the exploration use case. In financial services, we find organizations using exploration to perform forensics or to identify fraud. A professional sports team might use data science to analyze trades and its annual draft, as popularized by the movie Moneyball. Ultimately, data science and exploration are used to identify net-new business opportunities or net-new insight in ways that were not possible before Hadoop.
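
For a flavor of what exploration looks like in practice, the sketch below runs an ad-hoc query directly against data sitting in Hadoop through HiveServer2, using the third-party PyHive client. The host, database, table and column names are assumptions for this example; the point is that no upfront ETL into a warehouse is required.

```python
# Ad-hoc exploration of data left in Hadoop, queried through HiveServer2.
# Requires the third-party PyHive package (pip install "pyhive[hive]").
# Host, database, table and column names below are assumptions for this sketch.
from pyhive import hive

conn = hive.Connection(host="hive-gateway.example.com", port=10000,
                       username="analyst", database="datalake")
cursor = conn.cursor()

# Which referrers drove the most traffic last month? The query runs directly
# against the raw web logs stored in Hadoop.
cursor.execute("""
    SELECT referrer, COUNT(*) AS hits
    FROM weblogs
    WHERE log_month = '2013-12'
    GROUP BY referrer
    ORDER BY hits DESC
    LIMIT 20
""")

for referrer, hits in cursor.fetchall():
    print(referrer, hits)

cursor.close()
conn.close()
```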

Pattern 3: Application enrichment

The third and final use case is "Application Enrichment." In this scenario, data stored in Hadoop is used to direct an application’s behavior. For example, by storing all web session data (i.e., the session histories of all users on a website), we can customize the experience for a customer when they return. By keeping all this data in Hadoop, we maintain a session history from which we’re able to generate real value–for example, by providing a timely offer based on a customer’s web history.
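
A hedged sketch of how that lookup might work: one common approach keeps the session history in HBase, the low-latency store that runs on top of Hadoop, and reads it back when the customer returns. The happybase client, table name and column layout below are assumptions for this example, not a prescribed architecture.

```python
# Application enrichment sketch: read a returning customer's session history
# out of HBase (running on HDFS) to personalize the response.
# Requires the third-party happybase package and an HBase Thrift server;
# the host, table name and "visits" column family are assumptions.
import happybase

connection = happybase.Connection("hbase-thrift.example.com")
sessions = connection.table("user_sessions")

def record_page_view(user_id: str, timestamp: str, url: str) -> None:
    # One row per user; one column per visit, keyed by timestamp.
    sessions.put(user_id.encode(), {f"visits:{timestamp}".encode(): url.encode()})

def personalized_offer(user_id: str) -> str:
    # Pull the stored history and pick an offer based on what was browsed.
    history = sessions.row(user_id.encode())
    visited_pricing = any(url == b"/pricing" for url in history.values())
    return "discount on the plan you viewed" if visited_pricing else "standard homepage"

record_page_view("user-42", "2013-12-01T10:15:00", "/pricing")
print(personalized_offer("user-42"))
```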

For many of the large web properties in the world–Yahoo, Facebook and others–this use case is foundational to their business. By customizing the user experience, they are able to differentiate in a significant way from their competitors. This was the second use case for Hadoop at Yahoo as it realized Hadoop could help improve ad placement. This concept translates beyond the large web properties and is being used by the more traditional enterprise to improve sales. Some brick and mortar organizations are even using these concepts to implement dynamic pricing in their retail outlets.

As one might expect, this is typically the last use case to be adopted–generally once organizations have become familiar with refining and exploring data in Hadoop. At the same time, it hints at how Hadoop usage can and will evolve to serve an ever-greater number of applications that are served by traditional databases today.

There is certainly complexity involved when any new platform technology makes its way into a corporate IT environment, and Hadoop is no exception. Whether you’re using Hadoop to refine, explore or enrich your data, the interoperability with existing IT infrastructures will be key. That’s why we’re currently seeing immense growth within the Hadoop ecosystem and integration between different vendor solutions. Hadoop has the potential to have a profound impact on the enterprise data landscape, and by understanding the common patterns of use, you can greatly reduce the complexity.

Capital Quiz

Let's find out how many capitals you know.

Avaya releases new mobile app for IP Office 8.0

The mobility application for Avaya IP Office 8.0 is called "one-X Mobile Preferred for IP Office." It currently provides unified communications capabilities on Android-compatible devices; within several weeks, Avaya will be rolling out similar functionality for iPhones.

The feature set was developed with the mobile worker in mind -- someone who is increasingly blending work, home and travel, and needs to be able to access a multitude of applications, Scotto said.

"This change is lifestyle is a big part of why mobility is so critical. People need to be able to, say, conduct conference calls on the road, or follow a meeting even if there is a lot of noise or confusion in the background."

Major enhancements include the ability to initiate an IM with several co-workers from a smartphone; features to manage multiple participants on a conference call; directory access to colleagues while on the road; and new ways to tap into presence.

"You can actually see their presence, both from the perspective of leveraging Microsoft (Nasdaq: MSFT) Exchange -- the system hooks into the Exchange and you can see, for instance, if someone is in a meeting -- and from the perspective of someone's actual location status," Scotto said.

For example, when setting up a teleconference among several colleagues, a user can tell what time zones people currently are in and arrange the meeting accordingly.

Another feature -- visual voicemail -- lets users see all of their business voicemails on their mobile device with date/time information. In addition, users can hear voicemail messages as they are being left, and have the option of interrupting the message to take the call.

Also new to Avaya IP Office 8.0 are the so-called "serverless" collaboration capabilities delivered through a new Avaya C110 Unified Communications module. It eliminates the need for an external server in IP Office implementations.

IBM researchers make 12-atom magnetic memory bit

Researchers have successfully stored a single data bit in only 12 atoms.

Currently it takes about a million atoms to store a bit on a modern hard disk, the researchers from IBM say.

They believe this is the world's smallest magnetic memory bit.

According to the researchers, the technique opens up the possibility of producing much denser forms of magnetic computer memory than today's hard disk drives and solid state memory chips.

"Roughly every two years hard drives become denser," research lead author Sebastian Loth told the BBC.

"The obvious question to ask is how long can we keep going. And the fundamental physical limit is the world of atoms.

"The approach that we used is to jump to the very end, check if we can store information in one atom, and if not one atom, how many do we need?" he said.

Below 12 atoms the researchers found that the bits randomly lost information, owing to quantum effects.

A bit can have a value of 0 or 1 and is the most basic form of information in computation.

"We kept building larger structures until we emerged out of the quantum mechanical into the classical data storage regime and we reached this limit at 12 atoms."

The groups of atoms, which were kept at very low temperatures, were arranged using a scanning tunnelling microscope. Researchers were subsequently able to form a byte made of eight of the 12-atom bits.

Central to the research has been the use of materials with different magnetic properties.

The magnetic fields of bits made from conventional ferromagnetic materials can affect neighbouring bits if they are packed too closely together.

"In conventional magnetic data storage the information is stored in ferromagnetic material," said Dr Loth, who is now based at the Center for Free-Electron Laser Science in Germany.

"That adds up to a big magnetic field that can interfere with neighbours. That's a big problem for further miniaturisation."

Other scientists thought that was an interesting result.

"Current magnetic memory architectures are fundamentally limited in how small they can go," says Dr Will Branford, of Imperial College London

"This work shows that in principle data can be stored much more densely using antiferromagnetic bits."

But the move from the lab to production may be some time away.

"Even though I as a scientist would totally dig having a scanning tunnelling microscope in every household, I agree it's a very experimental tool," Dr Loth said.

Dr Loth believes that by increasing the number of atoms to between 150 and 200, the bits can be made stable at room temperature. That opens up the possibility of more practical applications.

"This is now a technological challenge to find out about new manufacturing techniques," he said.

Read more at: http://www-03.ibm.com/press/us/en/pressrelease/36473.wss
