
What are the steps involved in big data solutions?

by Aarushi Sharma Human Resource Executive

i) Data Ingestion: The first step in deploying a big data solution is to extract data from its sources. These may include an Enterprise Resource Planning (ERP) system such as SAP, a CRM such as Salesforce or Siebel, an RDBMS such as MySQL or Oracle, or log files, flat files, documents, images, and social media feeds. The extracted data needs to be stored in HDFS. Data can be ingested either through batch jobs that run on a schedule (for example, every 15 minutes or once a night) or through real-time streaming, with delays ranging from 100 ms to 120 seconds.

ii) Data Storage: After ingestion, the data is stored either in HDFS or in a NoSQL database such as HBase. HBase works well for random read/write access, whereas HDFS is optimized for sequential access.

iii) Data Processing: The final step is to process the data using a processing framework such as MapReduce, Spark, Pig, or Hive.
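The three steps can be illustrated with a small, self-contained Python sketch. It is only a toy model and does not use any Hadoop, HBase, or Spark API: the sample events, the `ingest_in_batches`, `store`, and `process` function names, and the in-memory list and dictionary that stand in for HDFS and HBase are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample events: (timestamp in seconds, source name, payload text).
EVENTS = [
    (0, "crm", "order created"),
    (30, "logs", "order shipped"),
    (950, "crm", "order created"),
    (1000, "logs", "payment failed"),
]

BATCH_INTERVAL = 15 * 60  # 15-minute batch window, as in the ingestion step


def ingest_in_batches(events, interval):
    """Step i: group events into fixed time windows, like a scheduled batch job."""
    batches = defaultdict(list)
    for ts, source, payload in events:
        batches[ts // interval].append((ts, source, payload))
    return [batches[window] for window in sorted(batches)]


def store(batches):
    """Step ii: keep an append-only log (sequential scans) and a keyed index (random reads)."""
    log = []    # assumption: stands in for files on HDFS
    index = {}  # assumption: stands in for a row-key lookup such as an HBase table
    for batch in batches:
        for ts, source, payload in batch:
            log.append(payload)
            index[(source, ts)] = payload
    return log, index


def process(log):
    """Step iii: a MapReduce-style word count over the sequential log."""
    mapped = [(word, 1) for line in log for word in line.split()]  # map phase
    grouped = defaultdict(list)
    for word, one in mapped:  # shuffle phase: group values by key
        grouped[word].append(one)
    return {word: sum(ones) for word, ones in grouped.items()}  # reduce phase


batches = ingest_in_batches(EVENTS, BATCH_INTERVAL)
log, index = store(batches)
counts = process(log)
```

Here the log is read from start to finish during processing, which mirrors the sequential access HDFS is optimized for, while a single record can be fetched directly with a key such as `index[("logs", 1000)]`, which mirrors the random access HBase is suited to.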





Created on Nov 15th 2019 04:32.
