Data Engineer - King of Prussia


Salary: $83,690.00 - $125,470.00 /year *

Employment Type: Full-Time


: Information Technology

Job Summary

RDC is looking for a Data Engineer to be part of a team dedicated to building best-in-class machine learning solutions that protect the world's financial systems. As part of the Architecture team, the Data Engineer will work on the management side of data to make it easy for other systems and people (e.g., Data Science, Development, and Product) to use the data to develop and enhance stable and scalable software solutions.

Essential Duties and Responsibilities

  • Architect/design, implement, monitor, and maintain big data pipelines and ETL/ELT pipelines
  • Gather data requirements, capture and maintain technical/operational/business metadata
  • Source data from different systems
  • Store data using the optimal technology (e.g., SQL, NoSQL, HDFS, S3) for the particular use
  • Prepare data for analysis by performing data wrangling/munging
  • Cleanse data
  • Convert data from one format to another
  • De-duplicate data
  • Discover opportunities for data acquisition and pick the right tools to collect and analyze such datasets in batch and/or real-time
  • Recommend and implement methods to improve data governance, security, reliability, efficiency and quality
  • Implement best practices around data modeling, data partitioning and data backfilling on new and existing data
  • Help the team ensure compliance with all regulatory requirements related to data privacy
  • Work closely with the Architecture, Data Scientist, and Tech-Ops teams to ensure efficient and effective delivery of data solutions
  • Interface with Software Engineers, Product Managers and Business Analysts to understand goals and data needs and to implement data-driven features/products


  • Bachelor's degree in Computer Science or related field
  • At least 3 years of relevant data engineering experience building, testing and maintaining a data architecture
  • Strong background in software engineering with 3+ years of experience in software development writing production code in POSIX shell, bash (or similar), plus at least one other dynamic language (e.g., Perl, Ruby, Python). Java experience desirable but not required.
  • Knowledge of data encoding (ASCII, UTF-8, UTF-16, UCS-2, etc.)

2-3 years professional experience with:

  • Unix shell scripting and tool building
  • ETL/ELT tools and approaches
  • SQL (MSSQL, PostgreSQL) and NoSQL databases (MongoDB)
  • Various data serialization formats such as Apache Avro, Apache Parquet, JSON, CSV, YAML, XML
  • Elasticsearch/Kibana or a similar distributed search and analytics engine

Desirable but not essential:

  • Amazon Web Services (EMR, S3, Glue, IAM, ECS)
  • Kafka, Spark Streaming or a similar real-time stream processing framework
  • Spark ecosystem (DataFrames, MLlib, Spark SQL) & Hadoop ecosystem (HDFS, Hive)
  • ActiveMQ, RabbitMQ or a similar messaging system
  • Databricks/AWS or a similar web-based platform for working with Spark and other Big Data tools
  • Apache Oozie, Apache Airflow, Luigi or a similar workflow management system

Demonstrated proficiency with:

  • Unix/Linux OS
  • Database Management Systems
  • Distributed Systems
  • Big Data concepts/tools

Curious, self-driven, analytical

Demonstrated ability to work with ambiguous requirements, adapt, and learn

Excellent verbal and written communication

Equal Employment Opportunity (EEO)

It is the policy of Regulatory DataCorp, Inc. and Regulatory DataCorp Limited (herein referred to as RDC) to provide equal employment opportunity to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state or local law. In addition, RDC will provide reasonable accommodations for qualified individuals with disabilities.

Job Description Disclaimer

This job description is not intended as, and does not create, an employment contract. RDC maintains its status as an at-will employer. All descriptions have been reviewed in an attempt to illustrate the job's functions and basic duties and the minimum standards required to successfully perform the position. The list of duties, responsibilities, and requirements should not be interpreted as all-inclusive. RDC retains the right to change or assign other duties to this position.

Associated topics: data architect, data center, data engineer, data integration, data management, data quality, database, mongo database administrator, sql, sybase

* The salary listed in the header is an estimate based on salary data for similar jobs in the same area. Salary or compensation data found in the job description is accurate.
