HDFS, which stands for Hadoop Distributed File System, is a core component of the Apache Hadoop framework. It’s a highly scalable, fault-tolerant, and distributed file system designed to store massive datasets across clusters of commodity hardware. Here’s a breakdown of what HDFS is and why it’s so crucial for big data: 1. Distributed Storage: 2. […]
January 12, 2026 | Big Data | No comments
In modern JavaScript, there are often several ways to achieve the same result. A common LeetCode problem, Concatenation of Array, asks us to take an array nums and return a new array that is twice as long, consisting of nums followed by nums. While the task is simple, the way we solve it reveals a […]
January 7, 2026 | Data Structure, Javascript | No comments
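As a rough illustration of the "several ways" the Concatenation of Array excerpt mentions, here is a minimal sketch of three approaches. The function names are hypothetical and chosen only for this example; the full post may use different ones or compare other techniques.

```javascript
// Three equivalent ways to build an array holding nums followed by nums.
// All function names here are illustrative assumptions, not from the post.

// 1. Spread syntax: short and declarative.
function concatWithSpread(nums) {
  return [...nums, ...nums];
}

// 2. Array.prototype.concat: returns a new array and leaves nums untouched.
function concatWithConcat(nums) {
  return nums.concat(nums);
}

// 3. Manual loop: allocate the result once, then fill both halves in one pass.
function concatWithLoop(nums) {
  const n = nums.length;
  const result = new Array(2 * n);
  for (let i = 0; i < n; i++) {
    result[i] = nums[i];
    result[i + n] = nums[i];
  }
  return result;
}
```

All three return a new array of length `2 * nums.length` and do not mutate the input; which one is fastest depends on the engine and input size, so it is worth measuring rather than assuming.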
The Valid Anagram problem is a classic interview challenge that tests your ability to manage data frequency. While there are many ways to solve it, the journey from using a flexible Map to a high-performance Int32Array reveals a lot about how JavaScript works under the hood. 1. The Foundation: Using a JavaScript Map The most […]
January 7, 2026 | Data Structure, Javascript | No comments
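The two ends of the journey described in the Valid Anagram excerpt can be sketched as follows. This is a hedged illustration, not the post's own code: the function names are hypothetical, and the `Int32Array` version assumes the inputs contain only lowercase letters `a` to `z`, which is a common constraint for this problem but is not stated in the excerpt.

```javascript
// Map-based frequency count: works for any characters.
// (Function names are illustrative assumptions.)
function isAnagramMap(s, t) {
  if (s.length !== t.length) return false;
  const counts = new Map();
  for (const ch of s) {
    counts.set(ch, (counts.get(ch) ?? 0) + 1);
  }
  for (const ch of t) {
    const remaining = counts.get(ch);
    if (!remaining) return false; // missing (undefined) or already used up (0)
    counts.set(ch, remaining - 1);
  }
  return true;
}

// Int32Array-based count: ASSUMES only lowercase 'a'..'z' in both strings.
function isAnagramTyped(s, t) {
  if (s.length !== t.length) return false;
  const counts = new Int32Array(26); // zero-initialized, one slot per letter
  const base = 97; // char code of 'a'
  for (let i = 0; i < s.length; i++) {
    counts[s.charCodeAt(i) - base]++;
    counts[t.charCodeAt(i) - base]--;
  }
  // Every slot returns to zero only if the letter frequencies match.
  return counts.every((c) => c === 0);
}
```

The typed-array version trades generality for a fixed-size, contiguous buffer of integers, which avoids hashing and per-entry allocation.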
Before we understand LSM, let’s understand the key concepts of SS Tables. Sections: SS Tables, LSM Algorithm, Storage of data, Compaction, Retrieval of data, References.
June 30, 2025 | Data Structure | No comments
Overview This article lists some common unit testing patterns (the list is not exhaustive) and how to tackle them in Java. It assumes the Mockito Java test framework for demonstration purposes. The full code can be found on GitHub. To bring the patterns to life, I have created five simple classes with minimal […]
May 26, 2025 | Java, Unit Testing | No comments
Overview There are often situations where one of the DB configuration parameters needs to be changed. Some common scenarios (the list is not exhaustive) are: The requirement to adjust these parameters calls for an understanding of the items below: This article addresses each item from the above list. Change Configuration Parameters This official […]
October 27, 2024 | SAP HANA | No comments
Check Linux distribution: Tail all logs in a directory with subfolders: Create a user/password: Add user to sudoers list: Check if the user exists/was created: List all sudoers: Switch users: Change password for a user: Find directory: Find file in current directory: Find the count of search results: Print disk usage of a given folder: Pretty […]
October 11, 2024 | Linux | No comments
Overview This article defines the key components and the setup required to publish a serialized message, based on an Avro schema, to a Kafka topic. Summary of what this article will cover: Why AVRO Before we get into why AVRO is required, we need to understand why data is serialized in the first place. Serialization is a translation […]
October 9, 2024 | Kafka | No comments
Overview Why is the codecache critical for understanding Java application performance? Before we jump into the topic, let’s first understand how the codecache is related to the code written in a .java file. As shown in the figure above, the code written in a .java file is compiled into bytecode, i.e. a .class file. The bytecode is understood by […]
October 9, 2024 | Java | No comments
Overview As Java programmers, we are sometimes so engrossed in writing code and getting the feature/story delivered that we ignore the performance impact of the code we write. This generally results in applications crashing during stress tests or, more often, in the production environment. The impact of application crashing in the […]
October 9, 2024 | Java | No comments