Treasure Data CDP Resources

The 4 Important Things About Analyzing Data Part 2: Understand the Purpose of the Analysis and Who Needs the Results

Before analyzing data, it is important to first clearly understand for whom and for what purpose you are conducting the analysis. This is essential because analytics assist humans in making decisions. Therefore, conducting the analysis to produce the best results for the decisions to be made is an important part of the process, as is ...

Why the Unified Logging Layer Matters

The volume of logs produced today is staggering. The logs provide opportunities for analysis to better understand customers and continually improve products. The log collection pipeline, then, becomes a source of valuable data. Collecting and unifying the data for better consumption and analysis can be a challenge. It is important to understand the nuances of ...

Treasure Data Joins the Linux Foundation

Today is a big step forward for our customers and community in general, as we officially join the Linux Foundation. As you may know, our company is driven by an open source culture: We believe that continuous innovation,...

Presto versus Hive: What You Need to Know

There is much discussion in the industry about analytic engines and, specifically, which engines best meet various analytic needs. This post looks at two popular engines, Hive and Presto, and assesses the best uses for each. How Hive Works: Hive translates SQL queries into multiple stages of MapReduce, and it is powerful enough to handle ...

Four Reasons Presto is the Best SQL-on-Hadoop (That You Haven’t Heard Of)

Presto is an in-memory distributed SQL query engine developed by Facebook that has been open source since November 2013. Presto has a number of key advantages over other SQL-on-Hadoop engines, yet these benefits are not widely recognized or understood. Reason #1: Presto is Plenty Fast. Unlike MapReduce, which was designed for very high throughput at the ...

The 4 Important Things about Analyzing Data Part 1: The Importance of Providing Many ‘Obvious’ Results

In the past few years, mass accumulation of data has increased. Meanwhile, the distributed parallel processing of the data has matured. Although the original focus was on the analysis back end (platforms) to support the accumulation of data and the efficiency of the batches, I believe that the importance of the “data,” the “analysis,” and that of ...

Eliminating Schema Rot in MPP Databases Like Redshift

The MPP database is an incredible piece of technology. These databases run large-scale analytic queries very quickly, making them great tools for iterative data exploration. With a cloud offering like Redshift in the market, MPP databases are enjoying increasing adoption today outside of enterprise IT. However, like any other great technology, they excel in some ...

Managing the Data Pipeline with Git + Luigi

One of the common pains of managing data, especially for larger companies, is that a lot of data gets dirty (which you may or may not even notice!) and becomes scattered around everywhere. Many ad hoc scripts run in different places, and these scripts silently generate dirty data. Further, if and when a script results ...

Learn SQL by Calculating Customer Lifetime Value Part 2: GROUP BY and JOIN

This is the second installment of our SQL tutorial blog series. In the first part, we set up the data source with SQLite and learned how to filter and sort data. This time, we will learn two other key concepts in SQL: GROUP BY and JOIN. ...
