  • The Whys and Hows of Promise in JavaScript

    If you have done any web development after 2015, chances are that you have heard of the concept of Promise. It wouldn’t be an overstatement to claim that Promise is ubiquitous in modern-day front-end codebases. However, many web developers, especially those without much experience in front-end development, have been using Promise without thoroughly understanding its inner workings. This has led to countless misuses of Promise and the bugs that follow. In this article, I will walk you through what motivated the creation of Promise, how Promise works, and how we should use it in our code, under the assumption that you don’t have much experience in front-end development.

    ...
  • All You Need to Know about Isolation Levels and Read Inconsistencies

    Just like real life, the world of computer science is replete with trade-offs. Relational databases are no exception. When interacting with relational databases, we face the dilemma between data consistency and transaction concurrency. The former guarantees the data is trustworthy, while the latter ensures relational databases can conduct transactions swiftly. Both are desirable qualities of relational databases, but we cannot simultaneously achieve them to the fullest extent. Today, I will discuss how isolation levels can help us structure our decision-making regarding this dilemma.

    ...
  • CAP Theorem But Better? Introduce the PACELC Theorem

    In the previous post, I introduced the famous CAP Theorem (please give it a read before this one if you haven’t already). It describes a trilemma in which we must give up one of the following three qualities: consistency, availability, and partition tolerance. Since all three are desirable features of modern-day distributed systems, determining which one to relinquish has become one of the most important and delicate decisions for designers of complicated distributed systems. While the CAP Theorem is widely known in computer science, its extension, the PACELC theorem, has received less attention. Today, I will shine a long-overdue spotlight on the PACELC theorem.

    ...
  • What You Need to Know about the CAP Theorem

    The world that we live in is far from perfect. We constantly find ourselves in dilemmas, sometimes even trilemmas, that require us to make trade-offs. When shopping, we can only choose two out of “cheap,” “fast,” and “good.” In economics, a government cannot enjoy “sovereign monetary policy,” “fixed exchange rate,” and “free capital flow” at the same time. It can only achieve two of them by giving up the third. Similarly, the CAP (consistency, availability, and partition tolerance) theorem involves an equally head-scratching trilemma that has troubled computer scientists and software engineers ever since distributed computing became a popular solution to large-scale computation. Today, we will dive deep into the CAP theorem and learn how to make wise trade-offs based on our needs.

    ...
  • To all new graduates

    Do you want to work for a world-changing startup from Japan?

    Starting your career as a new graduate is one of the most important life events for many people.

    Over the past 30 years, countries around the world have accelerated their economic ties, leading to globalization.

    Meanwhile, politically, the world has changed from the relatively stable international situation of the latter half of the 20th century to the VUCA (Volatility, Uncertainty, Complexity, and Ambiguity) era, due to the confrontation between liberal capitalism and former communist countries and the increasing presence of the Global South.

    ...
  • A Brief Introduction to Kafka

    There’s no denying that we have already ushered in the era of big data. An enormous amount of information is generated every second. While decision-makers can gain invaluable insights from this ever-growing data, its sheer volume also poses considerable challenges to data engineers: greater demand for storage space, the need to handle increasingly complex data formats, and highly unpredictable network traffic. Luckily, recent years have witnessed the creation of various technologies devoted to efficiently digesting big data, and Kafka is one of them. Today, I will demonstrate how Kafka works to help kick-start your Kafka journey.

    ...
  • Colorkrew Attendance at SXSW March 2023 in Texas with Colorkrew Biz

    If you read a lot about innovation and technology, you’ve probably heard or seen some publication involving the acronym SXSW.

    The acronym stands for South by Southwest, an innovation festival. It is one of the biggest events of our time, bringing together people from all over the world to debate themes such as innovation, media, music, behavior, and interactivity. The purpose of the event is to give people in technology a space to share their ideas and experiences. In this post, we are going to talk about SXSW in more detail.

    ...
  • A Brief Introduction to the Inner Working of MapReduce

    As a data engineer, you have probably heard about Hadoop. It is one of the most popular frameworks for distributed processing of large data sets. It is less costly and more secure than other frameworks. At its center is a programming model called MapReduce. Today we will take a closer look at MapReduce to understand the inner workings of Hadoop.

    ...
  • ETL vs. ELT: Pick the Most Suitable Data Integration Method for Your Project

    As a data engineer, you probably have heard of the data integration methodology called ETL (Extract-Transform-Load). It has been around for a while, and many data engineers have used this methodology to build data pipelines. However, ETL is not the only option up our sleeves. Recently, ELT has also been gaining a lot of popularity. In this article, I will compare ETL and ELT to help you understand their respective advantages and drawbacks so that you can choose the methodology more suitable for your project.

    ...
  • Welcome to Colorkrew Biz Developer Camp

    Hi, I’m Alessandro.

    The Colorkrew Biz team recently went on a company camp at the Enoshima Beach House to make up for the disruptions caused by COVID-19. While the team had done well with remote work, the company knew that in-person interactions and team bonding were important.

    We spent two days in a dreamy beach house in stunning Enoshima, partying and enjoying ourselves with our teammates. But that’s not all - in between all the fun, we also had the opportunity to tackle interesting technical challenges and come up with improvements for our product. It was the perfect blend of work and play!

    ...
  • All You Need to Know about Lazy Evaluation in Spark

    Few would disagree that the word “lazy” has a negative connotation. We usually describe someone who is not willing to work hard as lazy. However, not all laziness is undesirable, and sometimes we prefer laziness to diligence, especially in the computer science world. One example is lazy evaluation. Today, we will closely inspect why lazy evaluation is essential to Spark’s high performance and how lazy evaluation works in Spark.

    ...
  • Parquet Files: Smaller and Faster than CSV

    If you have been in the world of big data long enough, you have probably heard about Parquet files. You might even have used them while thinking to yourself: “Why can’t we just use CSV files?” Today, I will demystify Parquet files and explain why a growing number of data scientists prefer Parquet files to CSV files.

    ...
  • Actions, Narrow Transformations, and Wide Transformations

    Hi! My name is Ziyu Chen. I am a full-stack engineer at Colorkrew. I love learning and writing about data engineering.

    Today, I would like to discuss transformations and actions in Spark. Of course, you can dive into the world of Spark and perform ETL processes without knowing how they differ from each other. However, once you are in charge of optimizing pre-existing ETL processes or building new ones from scratch, not understanding their differences might sabotage the efficiency of the ETL processes that you create. In the end, poorly designed, inefficient ETL processes might even increase the operating costs of your organization. Therefore, a thorough understanding of transformations and actions becomes necessary if you want to build high-performance ETL processes using Spark.

    ...
  • Dependency Injection with Laravel

    Dependency Injection - With Laravel

    Dependency injection is a commonly used design pattern in object-oriented programming. Through some pre-established conventions, we are able to manage the creation of our dependencies more easily. We can declare, replace, or even mock the dependencies as needed without changing the code that relies on them.

    ...