With Facebook going public, its big data model has attracted growing attention. The newly built Facebook data center in Prineville, Oregon, is reputed to be the world's most energy-efficient data center. What are the specific features of Facebook's data business? Here is a brief look.

Data collection

Timeline

Timeline, released in December 2011, is mainly a redesign of the Profile. A Facebook Profile is a person's record and information, or in layman's terms, a personal homepage. The new interface is more visually striking than previous versions: it organizes what a person has posted on Facebook, such as statuses, pictures and videos, and displays it in a more structured, chronological way, like an autobiography on Facebook.

Like Button

This feature lets users mark the pages they like so that those pages are included in Facebook's search results, much as Google uses links between pages to determine search rankings. Facebook said: "As long as users click the 'Like' button, all websites that support the Open Graph protocol will be displayed in the search engine." By using the Open Graph protocol to expand the scope of its search index, Facebook could pose a threat to Google.

Data storage

Memcached

Memcached is a distributed in-memory cache system. Facebook uses it as a cache layer between its web servers and MySQL servers, because database access is relatively slow. Over the years, Facebook has made many optimizations to Memcached and the software around it, such as the network stack. At any given time, Facebook keeps tens of terabytes of data cached on thousands of Memcached servers, which may be the largest Memcached cluster in the world.

Haystack

Haystack is Facebook's high-performance photo storage system, although strictly speaking it is not limited to storing photos.
It has to manage more than 20 billion uploaded photos, and each photo is saved in four different resolutions, so there are more than 80 billion stored images. Handling that volume is not enough on its own; performance is also critical. Facebook serves about 1.2 million photos per second, not counting those delivered by the CDN, which is a staggering number.

Cassandra

Cassandra is a distributed storage system with no single point of failure. It is a poster child for the NoSQL movement, has been open sourced, and has become an Apache project. Facebook uses it for inbox search, and other sites use it too.

Data analysis

Hadoop Architecture

Hadoop is the most popular open source tool for distributed and parallel computing today. It provides a distributed file system for storage as well as a framework for running computations over large-scale data sets across clusters of machines. Facebook is a loyal Hadoop user and a contributor to its source code. Facebook has also contributed two important components of the ecosystem, Hive and Thrift, both of which are now Apache projects.

Hive

Hive originated at Facebook. It makes it possible to run SQL queries on Hadoop, so that non-programmers can use it easily. Hive is a data warehouse tool built on Hadoop: it maps structured data files to database tables, provides a SQL-like query language, and converts the queries into MapReduce jobs.

Zookeeper, Thrift

The Hadoop ecosystem also includes Zookeeper, a coordination service that supports distributed locks and provides functions similar to Google's Chubby. Thrift is a cross-language service interface framework that supports many languages, such as PHP and Ruby.

BigPipe

BigPipe is a dynamic web page serving system developed by Facebook. To achieve the best performance, Facebook uses it to process each web page in blocks called "pagelets". For example, the chat window and the news feed are transmitted as separate blocks. Pagelets can be processed in parallel, which improves performance and also means that users can still access the rest of the page even if one pagelet fails or is interrupted.
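
The pagelet behavior described above can be illustrated with a small Python sketch. This is not Facebook's implementation: the render functions, the PAGELETS mapping and the empty-placeholder fallback are all hypothetical names chosen for illustration. It only shows the two properties the text mentions, namely that blocks are produced in parallel and that one failing block does not take down the rest of the page.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def render_chat():
    # Hypothetical pagelet: the chat window.
    return "<div id='chat'>chat window</div>"


def render_news_feed():
    # Hypothetical pagelet: the news feed.
    return "<div id='feed'>news feed</div>"


def render_ads():
    # Hypothetical pagelet that simulates an unavailable backend.
    raise RuntimeError("ads backend unavailable")


def render_page(pagelets):
    """Render every pagelet in parallel and return {name: html}.

    A pagelet that raises is replaced by an empty placeholder so the
    remaining pagelets are still delivered.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(pagelets))) as pool:
        futures = {pool.submit(fn): name for name, fn in pagelets.items()}
        # as_completed yields each future as soon as it finishes; a real
        # server could send that block to the browser at this point.
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception:
                # Failure isolation: only this block is left empty.
                results[name] = f"<div id='{name}'></div>"
    return results


PAGELETS = {"chat": render_chat, "feed": render_news_feed, "ads": render_ads}
```

Calling render_page(PAGELETS) returns the chat and feed markup unchanged and an empty placeholder for the failing "ads" block, so the page as a whole remains usable.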