Treasure Data CDP Resources

Treasure Data Partners with Sierra Ventures, Raises $5M in Oversubscribed Series A Financing

TD, the cloud-based analytics service, announced today that it has closed $5 million in Series A financing led by Sierra Ventures. This investment will fund the global expansion of the company’s operations. As part of the transaction,...

Treasure Data’s Plazma: Columnar Cloud Storage

TD was developed by Hadoop experts. We get Hadoop, and, in many ways, it’s part of our core. As we built out the platform, we noticed that the storage layer needed to be multi-tenant, elastic, and easy to manage while keeping the scalability...

Fluentd + Hadoop: Instant Big Data Collection

Many companies choose the Hadoop Distributed Filesystem (HDFS) for big data storage. Until recently, however, the only interface was the Java API. This changed with the new WebHDFS interface, which allows users to interact with HDFS via...

Understanding the Book-Crossing Dataset: Setup

I'm a data scientist at TD. In a series of blog entries, I want to introduce how to use our platform by interacting with a concrete dataset. I chose the publicly available Book-Crossing Dataset as our base data...

Log Everything as JSON. Make Your Life Easier

The Story of an Engineer. Here is an anecdote. I am sure some of you have had a similar experience.

Enabling Facebook’s Log Infrastructure with Fluentd

Facebook uses Scribe as its core log aggregation service. The description on GitHub reads, “Scribe is a server for aggregating log data streamed in real time from a large number of servers.”...

Real-Time Log Collection with Fluentd and MongoDB

For those of you who do not know what MongoDB is, it is an open-source, document-oriented database developed at 10gen, Inc. It is schema-free and uses a JSON-like format to manage semi-structured data...

Fluentd: The Missing Log Collector Software

The fundamental problem with logs is that they are usually stored in files, although they are best represented as streams (an observation from Adam Wiggins, CTO at Heroku). Traditionally, they have been dumped into text-based files and collected by rsync in...

MessagePack: The Missing Serializer

MessagePack, the efficient, blazing-fast serializer, is the core of our technology. The best way to describe MessagePack is “JSON on steroids”. It supports almost the same set of data types as JSON (Nil, Boolean, Integer, Float, String, Array, and Associative Array) but runs much faster and requires a fraction of the space.
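
To make the space claim concrete, here is a minimal sketch of how those data types map onto bytes. It is a hand-written encoder covering only a small subset of the MessagePack format, not the official library; the `pack` function and the sample `record` are hypothetical names used for illustration, and the sketch says nothing about speed.

```python
import json
import struct


def pack(obj):
    """Encode a small subset of Python values as MessagePack bytes (illustrative only)."""
    if obj is None:
        return b"\xc0"                                  # nil
    if obj is True:
        return b"\xc3"                                  # true
    if obj is False:
        return b"\xc2"                                  # false
    if isinstance(obj, int):
        if 0 <= obj <= 0x7F:
            return struct.pack("B", obj)                # positive fixint: value is the byte
        if -32 <= obj < 0:
            return struct.pack("b", obj)                # negative fixint
        if 0 <= obj <= 0xFF:
            return b"\xcc" + struct.pack("B", obj)      # uint 8
        if 0 <= obj <= 0xFFFF:
            return b"\xcd" + struct.pack(">H", obj)     # uint 16
        if 0 <= obj <= 0xFFFFFFFF:
            return b"\xce" + struct.pack(">I", obj)     # uint 32
        if -(2 ** 63) <= obj < 2 ** 63:
            return b"\xd3" + struct.pack(">q", obj)     # int 64 (valid, not always minimal)
        raise ValueError("integer out of range for this sketch")
    if isinstance(obj, float):
        return b"\xcb" + struct.pack(">d", obj)         # float 64
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        n = len(data)
        if n <= 31:
            return bytes([0xA0 | n]) + data             # fixstr
        if n <= 0xFF:
            return b"\xd9" + bytes([n]) + data          # str 8
        if n <= 0xFFFF:
            return b"\xda" + struct.pack(">H", n) + data  # str 16
        raise ValueError("string too long for this sketch")
    if isinstance(obj, list):
        n = len(obj)
        if n <= 15:
            head = bytes([0x90 | n])                    # fixarray
        elif n <= 0xFFFF:
            head = b"\xdc" + struct.pack(">H", n)       # array 16
        else:
            raise ValueError("array too long for this sketch")
        return head + b"".join(pack(item) for item in obj)
    if isinstance(obj, dict):
        n = len(obj)
        if n <= 15:
            head = bytes([0x80 | n])                    # fixmap
        elif n <= 0xFFFF:
            head = b"\xde" + struct.pack(">H", n)       # map 16
        else:
            raise ValueError("map too large for this sketch")
        return head + b"".join(pack(k) + pack(v) for k, v in obj.items())
    raise TypeError(f"unsupported type: {type(obj).__name__}")


# Hypothetical log record, encoded both ways for a size comparison.
record = {"user": "alice", "id": 42, "active": True, "tags": ["a", "b"], "score": None}
packed = pack(record)
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")
print(len(packed), "bytes as MessagePack vs", len(as_json), "bytes as compact JSON")
```

The savings come mostly from the type tags: small integers, short strings, and small containers fold their type and length into a single byte, whereas JSON spends characters on quotes, colons, commas, and brackets.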
