Amazon Recommends Fluentd as “Best Practice for Data Collection” over Flume and Scribe
Last updated August 20, 2013
Here is a guest blog on AWS about using Fluentd to build a unified logging layer.
This month, Parviz Deyham from Amazon Web Services promoted Fluentd as the best data collection tool for Amazon Elastic MapReduce (EMR), a hosted Hadoop framework running on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3).
In the best practices whitepaper, Parviz, an Enterprise Solution Architect at AWS, notes that "Fluentd is easier to install and maintain and has better documentation and support than Flume and Scribe." Collecting data in a scalable and reliable manner has an important place in big data architecture. Many big data analytics solutions fail to provide robust tools for data collection or require that developers write custom data collectors from the origin to the final collection point.
While such attempts to write custom data collectors are important, users can instead leverage open source frameworks that already provide scalable and efficient distributed data collection. Open source software is part of our DNA at Treasure Data, and we are thrilled that Parviz sees the value in a versatile and lightweight data collection tool like Fluentd to stream data efficiently to the cloud.
We would like to thank the entire Fluentd community for their dedication and contributions to making Fluentd a first-class data collection tool. This recommendation from Amazon validates the philosophy and benefits of Fluentd's architecture, maintainability, and simplicity.