February, 2023 Scholarnest Blogs

Know about Dimensional Data Modelling for Big Data

Dimensional data modelling has been a popular and effective approach to designing data warehouses and enabling business intelligence and analytics for many years. However, with the rise of big data, traditional dimensional modelling techniques face new challenges and limitations. Big data is characterized by its sheer volume, velocity, and variety, which make it difficult to store, process, and analyze using traditional …

How to Remove Duplicate Rows in a Spark Data Frame

Apache Spark
Apache, Apache Spark, pyspark, Spark Dataframe
February 6, 2023

The Apache Spark Data Frame API lets you read data from various sources and create a Spark Data Frame from it. However, the resulting Data Frame may contain duplicate rows, and they can show up for various reasons. Your ETL tool that moves data from one place to another place …

Unlock Apache Spark Memory Allocation – All you want to know

Apache Spark
Memory, Spark, Spark Memory
February 6, 2023

In this blog post, I will explain memory allocation for the Spark driver and executor. If you are here, I assume you are already familiar with Apache Spark, its architecture, and why Spark needs memory. However, I will quickly review a few concepts to bring all readers onto the same page. Let’s start with the following question. What is …

Unlock Apache Kafka – All you want to know and focus

Apache Kafka
Apache, kafka
February 6, 2023

What is Apache Kafka? Apache Kafka is a distributed streaming platform built on the principles of a messaging system. Kafka started out as a messaging system for building robust data pipelines. Over time, however, it has evolved into a full-fledged streaming platform that offers all the core capabilities needed to implement stream processing applications over real-time data …
