Data Pipeline Engineer Job at Mega Cloud Lab, San Jose, CA

  • Mega Cloud Lab
  • San Jose, CA

Job Description

Overview

We are seeking a skilled Data Pipeline Engineer with deep expertise in building, orchestrating, and optimizing large-scale data ingestion pipelines. This role is perfect for someone who thrives on working with high-volume telemetry sources, refining complex data workflows, and solving challenges like schema drift in a distributed systems environment.

Location: San Jose, CA (Onsite 2 days per week). Final-round interviews will be conducted in person.

Key Skills: Proven experience designing and building multiple data pipelines, with deep expertise in Airflow, Kafka, Python (PySpark), and cloud platforms. Must have hands-on experience with large-scale data warehouses (managing multiple TBs).

Key Responsibilities

  • Design, build, and manage scalable batch and real-time streaming pipelines for ingesting telemetry, log, and event data.
  • Develop, implement, and maintain robust data orchestration workflows using tools like Apache Airflow or similar platforms.
  • Onboard new data sources by building efficient connectors (API, Kafka, file-based) and normalizing diverse, security-related datasets.
  • Proactively monitor and manage schema evolution and drift across various source systems and data formats.
  • Implement comprehensive pipeline observability, including logging, performance metrics, and alerting systems.
  • Continuously optimize data ingestion for enhanced performance, reliability, and cost-effectiveness.
  • Collaborate with cross-functional teams, including detection, threat intelligence, and platform engineering, to align data ingestion with security objectives.

Required Qualifications

  • 5+ years of professional experience in data engineering or infrastructure roles with a focus on pipeline development.
  • Strong proficiency in Python and extensive experience with distributed data processing frameworks like Apache Spark/PySpark.
  • Hands-on experience with orchestration and workflow management tools such as Apache Airflow, Dagster, or Prefect.
  • Deep understanding of data ingestion patterns, schema management, and strategies for handling schema drift.
  • Practical experience with messaging/streaming platforms (e.g., Kafka) and cloud-native storage services (e.g., S3).
  • Proven experience developing solutions in a major cloud environment (AWS preferred, Azure, or GCP).

Job Tags

2 days per week
