Data Engineering Intern (Spring/Summer 2026)

Work from home | Full-time role | Hiring

Description:

Support the development and maintenance of data pipelines using Databricks, Spark, and similar technologies.
Write and optimize SQL and Python scripts for data transformation, integration, and automation tasks.
Develop automation scripts that populate metadata and comments across Databricks tables using structured definitions such as CSV files.
Assist in building a proof-of-concept for an automated data dictionary maintained with existing Databricks metadata.
Contribute to prototyping an AI-powered knowledge agent that uses internal data and documentation to answer common questions.
Collaborate with team members to improve data quality, cataloging, and metadata management across the ecosystem.
Participate in code reviews, design discussions, and sprint ceremonies to learn engineering best practices.
Document findings, workflows, and automation processes for future reuse.
Perform other duties as assigned.

Requirements:

Actively pursuing a Bachelor’s or Master’s degree in Computer Science, Software Engineering, Information Systems, or a related technical field.
Foundational knowledge of Python and SQL for data manipulation and analysis.
Familiarity with ETL concepts and structured data formats such as CSV, JSON, and Parquet.
Interest in cloud-based data platforms, with Azure preferred.
Strong analytical and problem-solving skills with an eagerness to learn.
Effective communication and teamwork skills.
Exposure to Databricks, Apache Spark, or other distributed data frameworks is preferred.
Familiarity with Git or version control practices is preferred.
Interest in AI/LLM-based automation, data documentation, or metadata management is preferred.
Prior project or internship experience in data engineering or cloud technologies is preferred.
