All You Need to Know about Isolation Levels and Read Inconsistencies
Just like real life, the world of computer science is replete with trade-offs. Relational databases are no exception. When interacting with relational databases, we face the dilemma between data consistency and transaction concurrency. The former guarantees the data is trustworthy, while the latter ensures relational databases can conduct transactions swiftly. Both are desirable qualities of relational databases, but we cannot simultaneously achieve them to the fullest extent...
CAP Theorem But Better? Introduce the PACELC Theorem
In the previous blog, I introduced the famous CAP Theorem (please give it a read if you haven’t already before you start reading this one). It involves a trilemma of needing to give up one of the following three qualities: consistency, availability, and partition tolerance. Since all three are desirable features of modern-day distributed systems, determining which one to relinquish has become one of the most important and delicate decisions for designers of complicated distributed systems...
What You Need to Know about the CAP Theorem
The world that we live in is far from perfect. We constantly find ourselves in dilemmas, sometimes even trilemmas, that require us to make trade-offs. When shopping, we can only choose two out of “cheap,” “fast,” and “good.” In economics, a government cannot enjoy “sovereign monetary policy,” “fixed exchange rate,” and “free capital flow” at the same time...
To all new graduates
Those who want to work for a world-changing startup from Japan? Starting your career as a new graduate is one of the most important life events for many people. Over the past 30 years, countries around the world have accelerated their economic ties, leading to globalization. Meanwhile, politically, the world has changed from the relatively stable international situation of the latter half of the 20th century to the VUCA (Volatility, Uncertainty, Complexity, and Ambiguity) era, due to the confrontation between liberal capitalism and former communist countries and the increasing presence of the Global South...
A Brief Introduction to Kafka
There’s no denying that we have already ushered in the era of big data. An enormous amount of information is generated every second. While decision-makers can gain invaluable insights from this ever-growing data, its sheer volume also poses considerable challenges to data engineers: greater demand for storage space, the need to handle increasingly complex data formats, and highly unpredictable network traffic...
Colorkrew Attendance at SXSW March 2023 in Texas with Colorkrew Biz
If you read a lot about innovation and technology, you’ve probably heard or seen some publication mentioning the acronym SXSW. It stands for South by Southwest, an innovation festival. It is one of the biggest events of our time, bringing together people from all over the world to debate themes such as innovation, media, music, behavior, and interactivity...
A Brief Introduction to the Inner Working of MapReduce
As a data engineer, you probably have heard about Hadoop. It is one of the most popular frameworks for distributed processing of large data sets. It is less costly and more secure than other frameworks. At its center is a programming model called MapReduce. Today we will take a closer look at MapReduce to understand the inner working of Hadoop...
ETL vs. ELT: Pick the Most Suitable Data Integration Method for Your Project
As a data engineer, you probably have heard of the data integration methodology called ETL (Extract-Transform-Load). It has been around for a while, and many data engineers have used this methodology to build data pipelines. However, ETL is not the only option up our sleeves. Recently, ELT has also been gaining a lot of popularity...
Welcome to Colorkrew Biz Developer Camp
Hi, I’m Alessandro. The Colorkrew Biz team recently went on a company camp at the Enoshima Beach House to make up for the disruptions caused by COVID-19. While the team had done well with remote work, the company knew that in-person interactions and team bonding were important. We spent two days in a dreamy beach house in stunning Enoshima, partying and enjoying ourselves with our teammates...
All You Need to Know about Lazy Evaluation in Spark
Few would disagree that the word “lazy” has a negative connotation. We usually describe someone who is not willing to work hard as lazy. However, not all laziness is undesirable, and sometimes we prefer laziness to diligence, especially in the computer science world. One example is lazy evaluation. Today, we will closely inspect why lazy evaluation is essential to Spark’s high performance and how lazy evaluation works in Spark...
Parquet Files: Smaller and Faster than CSV
If you have been in the world of big data long enough, you probably have heard about Parquet files. You might even have used them while thinking to yourself: “Why can’t we just use CSV files?” Today, I will demystify Parquet files and explain why a growing number of data scientists prefer Parquet files to CSV files...
Actions, Narrow Transformations, and Wide Transformations
Hi! My name is Ziyu Chen. I am a full-stack engineer at Colorkrew. I love learning and writing about data engineering. Today, I would like to discuss transformations and actions in Spark. Of course, you can dive into the world of Spark and perform ETL processes without knowing how they differ from each other...
Dependency Injection with Laravel
Dependency injection is a commonly used design pattern in object-oriented programming. Through some pre-established conventions, we are able to manage the creation of our dependencies more easily. We can declare, replace, or even mock the dependencies as needed without changing the code that relies on them...
The Power of TypeScript Types
Any frontend developer knows the frustration of seeing the infamous “is not a function” message pop up. It reminds you that, even though your code might achieve your goal, you will not be able to assure its consistency or reliability. Without strict typing there is no stopping tiny slip-ups due to a lack of coffee - or sleep, depending on your dedication...