Cloud engineering

Evolution of Data Pipelines

Published on: 11 Feb, 2022
Data Pipelines
Data Warehouses
Cloud
Data Lakes
Data Pipeline Architectures
Data Extraction
Data Mesh

In the past, when data had to be updated, operators entered it manually into a data table. This led to data-entry errors and time lag. Because the work was done mostly in batches, typically as a daily job, there was a substantial delay between the time an event occurred and the time it was reported. Decision makers had to live with this delay and often made decisions on stale data.

Fast forward to the present, and real-time updates and insights are commonplace requirements. Data pipelines were built essentially to move data from one layer (transactional or event sources) to data warehouses or data lakes, where insights were derived.

The question is whether, given the demand for real-time insights and other quality requirements, traditional architectures and the popular ETL approaches are still efficient. Let's find out!

Current state of Data Pipeline Architectures and Challenges

Data pipelines are important to any Product Digitization program. In the latter half of this decade we witnessed an immense focus on digital architecture and on the adoption of new technologies. The strong growth in the adoption of microservices and containerization establishes this fact. However, these advancements have been applied mostly to the traditional "OLTP" side, that is, the core service or business logic.

However, the story is a bit different when one inspects the patterns involved in data pipelines, or the "OLAP" side of things. Here we observe limited adoption of the technology evolution seen in the core services space. Most data pipelines are built using either traditional ETL or ELTL architectures, which are the de facto industry approaches. Though these do solve the larger problem at hand, i.e. deriving actionable insights, they also come with certain limitations. Let's explore some of these challenges:

Siloed Teams: The ETL process requires expertise in data extraction or migration. This can mean that the technical team is layered or structured around the technical nuances of the process. E.g.: an ETL engineer is often unaware of the insights being derived and of how they are consumed by end users.

Limited Manifestation: The implementation team tries to fit every desired use case into the set structure or pattern. Though this is not always a problem or the wrong thing to do, there are times when it is inefficient. E.g.: how does one extract from an unstructured source and model the intermediate persistence schema?

Latency: The time taken to extract, transform and load the data often introduces lag. This lag can be attributed to the fact that data is processed in batches, or to the intermediate load steps needed to persist interim results. In some business scenarios, this is not acceptable. E.g.: data streams emanating from an IoT service are stored and batch processed at a later scheduled time, thereby introducing a lag between data generation and updated insights on dashboards.
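To make the batch-induced lag concrete, here is a minimal sketch that computes how long an event waits for the next scheduled batch run. The function name `batch_lag`, the sample event times and the two schedules (daily and one-minute micro-batch) are assumptions chosen for illustration; the sketch ignores the processing time of the job itself.

```python
import math


def batch_lag(event_time: float, batch_interval: float) -> float:
    """Seconds an event waits until the next scheduled batch run starts.

    Runs are assumed to start at multiples of batch_interval (in seconds),
    with time measured from the same origin as event_time.
    """
    next_run = math.ceil(event_time / batch_interval) * batch_interval
    return next_run - event_time


# Hypothetical IoT readings, as seconds since midnight.
event_times = [60.0, 3_600.0, 43_200.0, 86_399.0]

DAILY = 86_400.0   # one batch run per day, at midnight
MINUTE = 60.0      # a one-minute micro-batch, for comparison

daily_lags = [batch_lag(t, DAILY) for t in event_times]
minute_lags = [batch_lag(t, MINUTE) for t in event_times]

for t, d, m in zip(event_times, daily_lags, minute_lags):
    print(f"event at {t:>8.0f}s  daily lag {d:>8.0f}s  micro-batch lag {m:>4.0f}s")
```

With a daily schedule, a reading taken one minute after midnight waits almost a full day before it can reach a dashboard, while the same reading under a one-minute micro-batch waits less than a minute.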

Future state of Data Pipeline Architecture and Key considerations

As general software architecture advances with microservices, service mesh, and so on, data pipelines need similar modernization. One key emerging approach is to distribute the data pipeline across domains instead of building one centralized pipeline; each domain builds its own data objects, and together these form a Data Mesh. Data Mesh aims to address the challenges above by adopting a different approach:

  • Teams or pods aligned on functional feature delivery
  • Treat data as a product (discoverable, self-contained and secure)
  • Polyglot storage and communication facilitated via the mesh
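
As a rough illustration of "data as a product", the sketch below models a small in-memory catalog in which each product describes itself (self-contained), can be found by keyword (discoverable) and carries a list of roles allowed to read it (secure). The class names `DataProductDescriptor` and `Catalog`, the product names and the roles are all hypothetical; a real mesh would rely on a proper catalog and access-control service.

```python
from dataclasses import dataclass, field


@dataclass
class DataProductDescriptor:
    """Hypothetical metadata a domain team publishes with its data product."""
    name: str
    owner_domain: str
    description: str
    schema: dict[str, str]         # column name -> type, so consumers need no side channel
    allowed_roles: frozenset[str]  # roles permitted to read the product


@dataclass
class Catalog:
    """In-memory stand-in for a data product catalog."""
    _products: dict[str, DataProductDescriptor] = field(default_factory=dict)

    def register(self, product: DataProductDescriptor) -> None:
        self._products[product.name] = product

    def discover(self, keyword: str) -> list[str]:
        """Return names of products whose name or description mentions keyword."""
        kw = keyword.lower()
        return sorted(
            p.name
            for p in self._products.values()
            if kw in p.name.lower() or kw in p.description.lower()
        )

    def can_read(self, product_name: str, role: str) -> bool:
        product = self._products.get(product_name)
        return product is not None and role in product.allowed_roles


catalog = Catalog()
catalog.register(DataProductDescriptor(
    name="orders_daily",
    owner_domain="sales",
    description="Daily order totals per region",
    schema={"region": "str", "day": "date", "total": "float"},
    allowed_roles=frozenset({"analyst", "sales"}),
))
catalog.register(DataProductDescriptor(
    name="shipments",
    owner_domain="logistics",
    description="Shipment tracking events",
    schema={"shipment_id": "str", "status": "str"},
    allowed_roles=frozenset({"logistics"}),
))

found = catalog.discover("order")
analyst_ok = catalog.can_read("orders_daily", "analyst")
guest_ok = catalog.can_read("orders_daily", "guest")
```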

Initial read on Data Mesh can be found here.

Data Mesh can be implemented in various ways. One effective pattern is to use an event-driven approach and event storming to form data products. A domain can comprise one or more data products. This also means that data can be redundant and persisted in one or more stores, which is referred to as polyglot storage. Finally, these data products are consumed via the mesh APIs, designed along the lines of each domain's requirements.
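
The event-driven pattern described above can be sketched as follows. This is only an illustration under stated assumptions: `EventBus` is a toy in-process publish/subscribe helper, `OrdersDataProduct` is a hypothetical data product for an "orders" domain, and its two stores (a dictionary and a list) stand in for a key-value store and an append-only log to hint at polyglot storage. The `get_order` and `event_count` methods play the role of the API through which the mesh would consume the product.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Event:
    topic: str
    key: str
    payload: dict


class EventBus:
    """Toy in-process publish/subscribe bus (an assumption, not a real broker)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Event], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, event: Event) -> None:
        for handler in self._subscribers[event.topic]:
            handler(event)


@dataclass
class OrdersDataProduct:
    """Hypothetical data product owned by an 'orders' domain."""
    latest: dict[str, dict] = field(default_factory=dict)  # key-value store stand-in
    history: list[Event] = field(default_factory=list)     # append-only log stand-in

    def on_event(self, event: Event) -> None:
        # The same event is persisted redundantly in both stores.
        self.history.append(event)
        self.latest[event.key] = event.payload

    # Consumer-facing API of the data product.
    def get_order(self, order_id: str):
        return self.latest.get(order_id)

    def event_count(self) -> int:
        return len(self.history)


bus = EventBus()
orders = OrdersDataProduct()
bus.subscribe("orders", orders.on_event)

bus.publish(Event("orders", "o-1", {"status": "created"}))
bus.publish(Event("orders", "o-1", {"status": "shipped"}))
bus.publish(Event("payments", "p-9", {"amount": 10}))  # no subscriber, ignored

current = orders.get_order("o-1")
total_events = orders.event_count()
```

Keeping both the latest state and the full history is a deliberate redundancy here: point lookups read from one store while audits or replays read from the other.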

Other architectural styles include Data Lake, Data Hub and Data Virtualization. A brief comparison of these can be found here.

Some other considerations that one should evaluate:

  • Facilitate easy data access at any time through standard interfaces like SQL. Technologies like Snowflake, dbt and Materialize build on SQL and enable joins on fresh data, which not only serves BI but also helps with the low-level plumbing of the pipeline
  • Design data pipelines to be robust and fault tolerant, e.g. checkpoint intermediate results where required for further analysis
  • Leverage distributed, loosely coupled processing units that scale and can use polyglot technologies, e.g. a Spark job or Python models
  • Use data virtualization to mitigate bottlenecks, e.g. to shorten the lead time for data availability
  • Use DataOps effectively to track and evaluate your data pipeline performance
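
The checkpointing consideration from the list above can be sketched as below. It is a minimal illustration, not a production design: `run_pipeline`, the stage names and the JSON-file checkpoints are assumptions, and the checkpoints are keyed only by stage name, so a real pipeline would also key them by run or by input.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

Stage = tuple[str, Callable[[list], list]]


def run_pipeline(data: list, stages: list[Stage], checkpoint_dir: Path) -> list:
    """Run stages in order, reusing a stage's checkpoint file if it exists."""
    for name, func in stages:
        checkpoint = checkpoint_dir / f"{name}.json"
        if checkpoint.exists():
            # Resume: load the persisted intermediate result instead of recomputing.
            data = json.loads(checkpoint.read_text())
        else:
            data = func(data)
            checkpoint.write_text(json.dumps(data))
    return data


calls: list[str] = []  # records which stages actually executed


def clean(rows: list) -> list:
    calls.append("clean")
    return [r for r in rows if r is not None]


def double(rows: list) -> list:
    calls.append("double")
    return [r * 2 for r in rows]


stages: list[Stage] = [("clean", clean), ("double", double)]

with tempfile.TemporaryDirectory() as tmp:
    first = run_pipeline([1, None, 3], stages, Path(tmp))
    # A rerun (for example after a crash) reads the checkpoints and skips both stages.
    second = run_pipeline([1, None, 3], stages, Path(tmp))

print(first, second, calls)
```

The persisted files also serve the "further analysis" purpose mentioned in the list, since each intermediate result can be inspected on its own after the run.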

Conclusion

Finally, I would like to conclude with a disclaimer. This article does not intend to discard current ETL-based architectures. In fact, for certain use cases such as batch jobs, ETL is still a very good option. The intent is rather to recognize that requirements vary and to explore further architectures that may suit the need better. In this article, we looked at a few such architectures, like Data Mesh, and the associated areas one needs to consider.

Feel free to drop your comments, feedback and queries on this article; I will try to answer each of them at my earliest convenience.

Authors

Salman Hamza Hussain, Practice Head, Digital Architecture & Analytics

© 2023 L&T Technology Services Limited. All Rights Reserved.
