Job #: 1567

Title: Data Lake Data Architect


  • New York City, NY
  • Job Type: Contract
  • Contract Pay Rate: $100-125
    • Anywhere
    • Posted 2 years ago
    • This position has been filled

    Enterprise Data Lake (EDL) Team Responsibilities:

    The EDL team will oversee and support the architecture and implementation of the EDL for all big data initiatives. It will drive data governance and facilitate data onboarding. It will approve the design of data and software architecture, perform architecture reviews to pass EDL tollgates, evaluate and select cloud/AWS/Big Data tools for acceptance, and serve as a liaison between data lake tool vendors and our internal infrastructure teams. It will also certify data for consumption, certify EDL patterns and processes, manage and govern data access controls, and manage data lake and data governance training initiatives across the enterprise.

    The EDL team will become the center of excellence for the following EDL components and associated tools:

    EDL architecture and patterns
    EDL Data Stores
    EDL Governance
    EDL Data Discovery
    EDL Data Preparation
    EDL Reporting
    EDL Ingestion tools & other technologies and tools
    Educate teams to migrate and develop new cloud applications

    Position Overview:

    We are looking for an accomplished big data architect with strong experience in AWS cloud data architecture and implementation.

    This role involves close collaboration with our team of passionate and innovative big data specialists, application developers, and product managers.

    This is a unique opportunity to be a member of our corporate CRM and Analytics Team, tackling our toughest and most exciting data lake challenges across multiple divisions.

    Basic Requirements:

    A minimum of 7 years of architectural experience with:
      big data architecture and technology offerings
      AWS/cloud big data modeling & data management
      analytics and ingestion architecture for big data
      data lake management and data architecture
      data lake design patterns & cloud enterprise best practices
      IoT, streaming, and real-time processing
      big data related AWS technologies
    Lead data architecting efforts in researching, identifying and implementing leading edge technologies and practices
    Expert in AWS and related technologies such as Kinesis, Lambda, EC2, Redshift, RDS, CloudFormation, EMR, S3, AWS Analytics, Spark, and Databricks
    Experience with at least one of the following languages: Scala, Python, R, and/or Java
    Experience with designing, developing, and implementing complex integration for end-to-end solutions at a middleware and app level with focus on performance optimization
    Strong implementation skills in AWS cloud development
    Demonstrated ability to implement scalable, real-time, high-performance cloud data lake solutions (AWS)
    Ability to quickly build proofs of concept to validate new technologies or approaches
    Ability to exercise independent judgment and creative problem-solving techniques in a highly complex environment using leading-edge technology and/or integrating with diverse application systems
    Ability to lead and drive technology change in a fast-paced, dynamic environment and across all phases of the software life cycle
    Strong experience with data catalog, data governance, Collibra, MDM, and/or Data Quality (IDQ) toolsets
    Strong experience with integration of diverse data sources (batch and real time) in the cloud
    Lead the design and sustainment of data pipelines and data storage
    Expertise in structured and unstructured data, and in SQL and NoSQL technologies
    Expertise with identifying and understanding source data systems and mapping source system attributes to the target
    Experience with the design and automation of ETL/ELT processes
    AWS and cloud performance tuning and optimization experience
    Experience with effort estimation for new projects/proposals on an ongoing basis
    Excellent communication skills across all levels; ability to communicate complex technical concepts with ease
    Ability to work effectively in a fast-paced environment

    Desired Skills:
    Exposure to Big Data Technologies such as MapReduce, Hadoop or other Big Data Platforms
    Exposure to building and deploying data and analytics solutions on AWS or Microsoft Azure cloud platforms
    Exposure to cognitive computing, ML, and AI
    Exposure to graph databases, SPARQL
    Exposure to search technologies like Lucene/Solr or Elasticsearch
    Exposure to ontologies & taxonomies
    Exposure to data services, APIs, and O/R mapping techniques
    Exposure to the financial services industry
    Experience with two or more of the following vendors and products: Informatica, SQL Server, Oracle 11g, MySQL, SQL Data Warehouse Appliance, Oracle Exadata, Netezza, Greenplum, Vertica, Teradata, Aster Data, SAP HANA, Hadoop, SAS, SPSS, Spotfire, Tableau, QlikView, R, Oracle Endeca, Oracle OBIEE, SAP BusinessObjects, and other analytics vendors with BI components